Gravitational lensing has become one of the most interesting tools to study the mass distribution
in the Universe. Since gravitational light deflection is independent of the nature and state of 
the matter, it is ideally suited to investigate the distribution of all (and thus also of dark) 
matter in the Universe. Lensing results have now become available over a wide range of scales, 
from the search for MACHOs in the Galactic halo, to the mass distribution in galaxies and 
clusters of galaxies, and the statistical properties of the large-scale matter distribution in the 
Universe. Here, after introducing the concepts of strong and weak lensing, several applications 
are outlined, from strong lensing by galaxies, to strong and weak lensing by clusters and the 
lensing properties of the large-scale structure. 



1. Introduction 

Light rays are deflected in gravitational fields, just like massive particles are. Hence, 
the deflection of light probes the gravitational field, and therefore the matter distribution 
that causes it. Since the field is independent of the state and nature of the matter 
generating it, it provides an ideal tool for studying the total (that is, luminous and dark) 
matter in cosmic objects. As we shall see, gravitational light deflection is used to study 
cosmic mass distributions on scales ranging from stars to galaxies, and from clusters of 
galaxies to the large-scale matter distribution in the Universe. In this contribution, I 
will concentrate on those aspects which are of particular relevance for learning about the 
dark matter distribution in the Universe. 

Gravitational lensing describes phenomena of gravitational light deflection in the weak-field,
small-deflection limit; strong-field light deflection (important for light propagation
near black holes or neutron stars) is not covered by gravitational lens (hereafter GL)
theory. The basic theory of gravity, and of light propagation in a gravitational field, is
General Relativity, which says that photons travel along null geodesics of the spacetime
metric (these are described by a second-order differential equation). In GL theory, several
simplifications apply, owing to restriction to weak fields, and thus small deflections. We 
shall see the convenience of those further below. 

Gravitational lensing as a whole, and several particular aspects of it, have been reviewed
previously. Two extensive monographs (Schneider, Ehlers & Falco 1992, hereafter SEF;
Petters, Levine & Wambsganss 2001, hereafter PLW) describe lensing in full depth, in
particular providing a derivation of the gravitational lensing equations from General 
Relativity. Fort & Mellier (1994) describe the giant luminous arcs and arclets in clusters 
of galaxies (see Sect. 5.3), Paczynski (1996) and Roulet & Mollerach (1997) review the 
effects of gravitational microlensing in the Local Group, whereas the reviews by Narayan 
& Bartelmann (1999) and Wambsganss (1998) provide a concise and didactical account 
of GL theory and observations. Much of this contribution will be focused on weak 
gravitational lensing, which has been reviewed recently by Mellier (1999), Bartelmann & 
Schneider (2001), Wittman (2002), van Waerbeke & Mellier (2003) and Refregier (2003). 

Figure 1. The radio source JVAS B1938+666 shows two radio sources (contours in the right
panel), one of which is mapped into four components, while the other shows a double image;
furthermore, the outer radio contours merge into an arc around the lensing galaxy. The underlying
grey-scale figure, and the left panel, show a near-IR image of the field, revealing the lens galaxy,
as well as the Einstein-ring image of the galaxy hosting the radio AGN (from King et al. 1998)

2. Basics of gravitational lensing 

2.1. Very brief history of lensing 

The investigation of gravitational light deflection dates back more than 200 years to 
Michell, Cavendish, Laplace and Soldner (see SEF, PLW for references and much more
detail). At that time a metric theory of gravity was not known, and light was treated as
massive particles moving with the velocity of light. General Relativity, finalized in 1915, 
predicts a deflection angle twice as large as 'Newtonian' theory, and was verified in 1919 
by measuring the deflection of light near the Solar limb during an eclipse. Soon after, the 
'lens effect' was discussed by Lodge, Eddington and Chwolson, i.e. the possibility that 
light deflection leads to multiple images of sources behind mass concentrations, or even 
yields a ring-like image. Einstein, in 1936, considered in detail the lensing of a source 
by a star (or a point-mass lens), and concluded that the angular separation between the 
two images would be far too small (of order milliarcseconds) to be resolvable, so that
"there is no great chance of observing this phenomenon". In 1937, Zwicky, instead of
looking at lensing by stars in our Galaxy, considered "extragalactic nebulae" (nowadays
called galaxies) as lenses. He noted that they produce angular separations that can
be separated with telescopes. Observing such an effect, he noted, would furnish an 
additional test of GR, would allow one to see galaxies at larger distances (due to the 
magnification effect), and to determine the masses of these nebulae acting as lenses. He 
furthermore considered the probability of such lens effects and concluded that about 1 
out of 400 distant sources should be affected by lensing, and therefore predicted that 
"the probability that nebulae which act as gravitational lenses will be found becomes 
practically a certainty". His visions were right on (nearly) all accounts. 

In the mid-1960s, Klimov, Liebes, and Refsdal independently formulated the basic
theory of gravitational lensing, and focused on astrophysical applications, such as the
determination of masses and cosmological parameters. For example, Refsdal noticed that the
difference in light travel time along the two rays corresponding to two images of a source is
proportional to the size of the Universe, thus to $H_0^{-1}$, and that a measurement of the time delay,
possible if the source varies intrinsically, would allow the determination of the Hubble
constant.

Figure 2. The redshift z = 0.18 cluster Abell 2218 displays an enormously rich structure of
arcs, highly stretched images of background galaxies which curve around the main cluster center
(seen to the right of the image center), but also around the secondary mass peak near the left
edge of this WFPC2@HST image. Together with the identification of several multiply-imaged
background sources, these lensing phenomena have yielded a very detailed mass map of the
inner part of this cluster (Courtesy J.-P. Kneib)

In 1979, the first GL system was discovered by Walsh, Carswell & Weymann: the
two images of the QSO 0957+561 are separated by about 6 arcseconds, having identical
colors, redshifts ($z_s = 1.41$) and spectra; both images are radio sources with a core-jet
structure on milli-arcsec scales. Soon thereafter, a galaxy situated between the two quasar
images was detected, with redshift $z_d = 0.36$, which is a member of a cluster. 1980 marks the
discovery of the first GL system with four QSO images, QSO 1115+080, two of which 
are very closely spaced. 1986 saw the discovery of a new lensing phenomenon, which had 
been predicted long before: the detection of a radio ring, in which an extended radio 
source is mapped into a complete ring by an intervening galaxy (see also Fig. 1). Such 
Einstein rings turn out to yield the most accurate mass determinations in extragalactic
astronomy. At present, some 72 multiple-image systems are known where the major 
lens component is a galaxy, including about 46 doubles, 20 four-image systems, but also 
one 3-image, one 5-image, and one 6-image system. The first of these were discovered 
serendipitously, but since the 1990's, large systematic searches for such systems were 
successfully conducted in the optical and, in particular, radio wavebands. 

In 1986, a new lensing phenomenon was discovered by two independent groups: strongly 
elongated, curved features around two clusters of galaxies. Their extreme length-to-width
ratios made them difficult to interpret; the measurement of the redshift of one of them 
placed the source of the arc at a distance well behind the corresponding cluster. Hence, 
these giant luminous arcs are images of background galaxies, highly distorted by the tidal 
gravitational field of the cluster. By now, many clusters with giant arcs are known and
have been investigated in detail, for which in particular the high resolution of the HST was
essential. Less extremely distorted images of background galaxies have been named arclets
and can be identified in many clusters.

If some sources are as highly distorted as those seen in Fig. 2, one expects to see many
more sources which are distorted to a much smaller degree - such that they cannot be
identified individually as lensed images, but that nearby images are distorted in a similar
way, so that the distortion can be identified statistically. This forms the basis of weak
lensing; coherent image distortions around massive clusters were detected in the early
1990s by Tyson and his group. As shown by Kaiser & Squires, these image distortions
can be used to obtain a parameter-free mass map of clusters. Further weak lensing
phenomena, such as galaxy-galaxy lensing, have been detected in the last decade; the 
weak lensing effect by the large-scale matter distribution in the Universe, the so-called 
cosmic shear, was discovered by several independent teams simultaneously in 2000, and 
this has opened up a new window in observational cosmology. 

Last but not least, gravitational microlensing in the Local Group was suggested
in 1986 by Paczynski as a test of whether the dark matter in the halo of our Galaxy
is made up of compact objects; the first microlensing events were discovered in 1993 
by three different groups. I refer to the lectures of Prof. Sadoulet for a discussion of 
microlensing and the results concerning the dark matter in our Milky Way. 

2.2. Deflection angle and lens equation 

We shall provide here the basic lensing relations; the reader is encouraged to refer to one 
of the reviews or books listed in the introduction for a full derivation of these relations. 

2.2.1. Deflection by a 'point mass' M 

Consider the deflection of a light ray by the exterior of a spherically symmetric mass $M$;
from the Schwarzschild metric one finds that a ray with impact parameter $\xi$ is deflected
by an angle

$$ \hat\alpha = \frac{4GM}{c^2\,\xi} = \frac{2R_s}{\xi} \qquad (2.1) $$

(the Einstein deflection angle; $R_s = 2GM/c^2$ is the Schwarzschild radius),
valid for $R_s/\xi \ll 1$, or $\hat\alpha \ll 1$; note that this also implies that $\Phi/c^2 \ll 1$, where $\Phi$ is
the Newtonian gravitational potential. The value for $\hat\alpha$ is twice the 'Newtonian' value
derived by Soldner and others and was verified during the solar eclipse of 1919!
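
As a quick numerical check of Eq. (2.1), the following sketch evaluates the deflection for a ray grazing the Solar limb. The constants are rounded textbook values chosen here for illustration, not taken from the text.

```python
import math

# Rounded physical constants (assumed values, SI units).
G = 6.674e-11        # gravitational constant [m^3 kg^-1 s^-2]
C = 2.998e8          # speed of light [m/s]
M_SUN = 1.989e30     # solar mass [kg]
R_SUN = 6.957e8      # solar radius [m]
RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0


def einstein_deflection(mass_kg, impact_m):
    """Deflection angle in radians from Eq. (2.1): 4GM / (c^2 xi)."""
    return 4.0 * G * mass_kg / (C**2 * impact_m)


alpha_limb = einstein_deflection(M_SUN, R_SUN) * RAD_TO_ARCSEC
newtonian = 0.5 * alpha_limb   # the 'Newtonian' value is half the GR one
print(f"GR: {alpha_limb:.2f} arcsec, 'Newtonian': {newtonian:.2f} arcsec")
```

The result is close to 1.75 arcseconds, the value tested in the 1919 eclipse measurement.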

2.2.2. Deflection by a mass distribution 

Since the field equations of General Relativity can be linearized if the gravitational field 
is weak, the deflection angle caused by an extended mass distribution can be calculated 
as the (vectorial) sum of the deflections due to its individual mass elements. If the 
deflection angle is small (which is implied by the weak-field assumption), the light ray 
near the mass distribution will deviate only slightly from the straight, undeflected ray. In 
this ('Born') approximation, valid if the extent of the mass distribution is much smaller 
than the distances between source, lens, and observer (the 'geometrically thin lens'), the 
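deflection angle depends solely on the surface mass density $\Sigma(\boldsymbol\xi)$, defined in terms of the
volume density $\rho(\mathbf r)$ as

$$ \Sigma(\boldsymbol\xi) = \int \mathrm{d}r_3\; \rho(\xi_1, \xi_2, r_3) , \qquad (2.2) $$

with the $r_3$-direction along the line-of-sight. Superposing the deflections by the mass
elements of the lens, one obtains the deflection angle

$$ \hat{\boldsymbol\alpha}(\boldsymbol\xi) = \frac{4G}{c^2} \int \mathrm{d}^2\xi'\; \Sigma(\boldsymbol\xi')\, \frac{\boldsymbol\xi - \boldsymbol\xi'}{|\boldsymbol\xi - \boldsymbol\xi'|^2} . \qquad (2.3) $$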



The geometrically-thin condition is satisfied in virtually all astrophysically relevant
situations (i.e. lensing by galaxies and clusters of galaxies), unless the deflecting mass extends
all the way from the source to the observer, as in the case of lensing by the large-scale
structure. The relevant deflections are small, e.g., $\hat\alpha \lesssim 1''$ for galaxies, $\hat\alpha \lesssim 30''$ for
clusters.
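
The superposition of point-mass deflections in a thin lens can be illustrated with a small sketch. It works in units where G = c = 1 and uses a hypothetical test configuration (a ring of equal point masses) that is not from the text: outside the ring the summed deflection should match that of a single point mass of the same total mass, and inside it should vanish.

```python
import math


def deflection(points, xi):
    """Vector sum of point-mass deflections (Eq. 2.1 per element), G = c = 1.

    points: iterable of (mass, x, y) in the lens plane.
    xi:     (x, y) position where the ray crosses the lens plane.
    """
    ax = ay = 0.0
    for m, px, py in points:
        dx, dy = xi[0] - px, xi[1] - py
        r2 = dx * dx + dy * dy
        ax += 4.0 * m * dx / r2
        ay += 4.0 * m * dy / r2
    return ax, ay


# Thin ring of total mass 1 and radius 1, sampled by n equal point masses.
n = 400
ring = [(1.0 / n, math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
        for k in range(n)]

outside = deflection(ring, (3.0, 0.0))   # compare with 4M/xi = 4/3 along x
inside = deflection(ring, (0.3, 0.0))    # compare with zero
print(outside, inside)
```

This is the discrete analogue of the integral over the surface mass density: only the projected mass distribution enters.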
2.2.3. The lens equation

The lens equation relates the true position of the source to its observed position;
we define the lens and source plane as planes perpendicular to the line-of-sight to the
deflector, at the distances $D_d$ and $D_s$ of the lens and the source, respectively (see Fig.
3). Furthermore, we define the 'optical axis' as a 'straight' line through the lens center
(the exact definition does not matter; any change of it represents just an unobservable
translation in the source plane), and its intersections with the lens and source planes as
their respective origins. Denoting by $\boldsymbol\xi$ the two-dimensional position of the light ray in
the lens plane and by $\boldsymbol\eta$ the position of the source (see Fig. 3), one finds from geometry

$$ \boldsymbol\eta = \frac{D_s}{D_d}\,\boldsymbol\xi - D_{ds}\,\hat{\boldsymbol\alpha}(\boldsymbol\xi) , \qquad (2.4) $$

where $D_{ds}$ is the distance from the lens to the source.
Note that the distances $D$ occurring here are the angular-diameter distances, since they
relate physical transverse separations to angles. If $\boldsymbol\theta$ denotes the angle of a light ray
relative to the optical axis, and $\boldsymbol\beta$ the angular position of the unlensed source (see Fig. 3),

$$ \boldsymbol\eta = D_s\,\boldsymbol\beta ; \qquad \boldsymbol\xi = D_d\,\boldsymbol\theta , \qquad (2.5) $$

then

$$ \boldsymbol\beta = \boldsymbol\theta - \frac{D_{ds}}{D_s}\,\hat{\boldsymbol\alpha}(D_d\boldsymbol\theta) = \boldsymbol\theta - \boldsymbol\alpha(\boldsymbol\theta) , \qquad (2.6) $$



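where $\boldsymbol\alpha(\boldsymbol\theta)$ is the scaled deflection angle, which in terms of the dimensionless surface
mass density

$$ \kappa(\boldsymbol\theta) := \frac{\Sigma(D_d\boldsymbol\theta)}{\Sigma_{\rm cr}} \quad \text{with} \quad \Sigma_{\rm cr} = \frac{c^2}{4\pi G}\,\frac{D_s}{D_d\,D_{ds}} \qquad (2.7) $$

reads as

$$ \boldsymbol\alpha(\boldsymbol\theta) = \frac{1}{\pi} \int \mathrm{d}^2\theta'\; \kappa(\boldsymbol\theta')\, \frac{\boldsymbol\theta - \boldsymbol\theta'}{|\boldsymbol\theta - \boldsymbol\theta'|^2} . \qquad (2.8) $$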



Note that the critical surface mass density $\Sigma_{\rm cr}$ depends only on the distances. Lenses
with $\kappa \sim 1$ at some points are called strong lenses, and those with $\kappa \ll 1$ everywhere are
weak lenses. The lens equation $\boldsymbol\beta = \boldsymbol\theta - \boldsymbol\alpha(\boldsymbol\theta)$ is a mapping $\boldsymbol\theta \to \boldsymbol\beta$ from the lens plane
to the source plane; but in general, this mapping is non-invertible: for a given source
position $\boldsymbol\beta$, the lens equation can have multiple solutions $\boldsymbol\theta$ which correspond to multiple
images of a source at $\boldsymbol\beta$.
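
To get a feeling for the size of the critical surface mass density, the sketch below evaluates the standard expression $\Sigma_{\rm cr} = c^2 D_s / (4\pi G D_d D_{ds})$ for one set of angular-diameter distances. The distances are hypothetical round numbers picked for illustration only; real values depend on the redshifts and the cosmological model.

```python
import math

G = 6.674e-11      # gravitational constant [m^3 kg^-1 s^-2]
C = 2.998e8        # speed of light [m/s]
MPC = 3.0857e22    # one megaparsec [m]


def sigma_crit(d_d, d_s, d_ds):
    """Critical surface mass density [kg/m^2]; distances in metres."""
    return C**2 * d_s / (4.0 * math.pi * G * d_d * d_ds)


def convergence(sigma, d_d, d_s, d_ds):
    """Dimensionless surface mass density kappa = Sigma / Sigma_cr."""
    return sigma / sigma_crit(d_d, d_s, d_ds)


# Hypothetical angular-diameter distances (assumption, not from the text).
D_D, D_S, D_DS = 1000 * MPC, 2000 * MPC, 1400 * MPC

s_cr = sigma_crit(D_D, D_S, D_DS)
# 1 kg/m^2 = 0.1 g/cm^2
print(f"Sigma_cr = {s_cr:.2f} kg/m^2 = {s_cr / 10:.2f} g/cm^2")
```

For these assumed distances the critical density is of order one gram per square centimetre, which is why only the dense central parts of galaxies and clusters act as strong lenses.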

2.2.4. Deflection and Fermat potentials 

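Since $\nabla \ln|\boldsymbol\theta| = \boldsymbol\theta/|\boldsymbol\theta|^2$, the deflection angle can be written as

$$ \boldsymbol\alpha = \nabla\psi \quad \text{with} \quad \psi(\boldsymbol\theta) = \frac{1}{\pi} \int \mathrm{d}^2\theta'\; \kappa(\boldsymbol\theta')\, \ln|\boldsymbol\theta - \boldsymbol\theta'| \qquad (2.9) $$

being the deflection potential; hence, the lens equation describes a gradient mapping.
From $\nabla^2 \ln|\boldsymbol\theta| = 2\pi\,\delta_{\rm D}(\boldsymbol\theta)$, where $\delta_{\rm D}$ denotes Dirac's delta-'function', one finds the 2-D
Poisson equation

$$ \nabla^2\psi = 2\kappa . \qquad (2.10) $$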



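Defining the Fermat potential

$$ \phi(\boldsymbol\theta;\boldsymbol\beta) = \tfrac{1}{2}(\boldsymbol\theta - \boldsymbol\beta)^2 - \psi(\boldsymbol\theta) , \qquad (2.11) $$

where $\boldsymbol\beta$ enters as a parameter, one sees that the lens equation can be written as

$$ \nabla\phi(\boldsymbol\theta;\boldsymbol\beta) = 0 . \qquad (2.12) $$

Solutions of (2.12) can then be classified according to whether the potential $\phi$ has a
minimum, maximum, or saddle point at the solution point $\boldsymbol\theta$.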
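
The step from the Fermat potential to the lens equation is short enough to spell out. Taking the gradient with respect to the image position at fixed source position, and using that the scaled deflection angle is the gradient of the deflection potential, gives:

```latex
\nabla_{\theta}\,\phi(\boldsymbol\theta;\boldsymbol\beta)
  = \nabla_{\theta}\!\left[\tfrac{1}{2}(\boldsymbol\theta-\boldsymbol\beta)^2
      - \psi(\boldsymbol\theta)\right]
  = (\boldsymbol\theta-\boldsymbol\beta) - \nabla\psi(\boldsymbol\theta)
  = \boldsymbol\theta - \boldsymbol\beta - \boldsymbol\alpha(\boldsymbol\theta) ,
\qquad\text{so}\qquad
\nabla_{\theta}\,\phi = 0
  \;\Longleftrightarrow\;
  \boldsymbol\beta = \boldsymbol\theta - \boldsymbol\alpha(\boldsymbol\theta) .
```

The type of stationary point is then read off from the signs of the eigenvalues of the Hessian $\partial^2\phi/\partial\theta_i\partial\theta_j = \delta_{ij} - \psi_{,ij}$: both positive for a minimum, both negative for a maximum, and of opposite sign for a saddle point.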

2.3. Effects of lensing 

2.3.1. Multiple images 

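Multiple images correspond to multiple solutions $\boldsymbol\theta$ of the lens equation for fixed source
position $\boldsymbol\beta$. For the case of a point-mass lens,

$$ \hat{\boldsymbol\alpha}(\boldsymbol\xi) = \frac{4GM}{c^2}\,\frac{\boldsymbol\xi}{|\boldsymbol\xi|^2} \quad\Rightarrow\quad \boldsymbol\beta = \boldsymbol\theta - \frac{4GM\,D_{ds}}{c^2\,D_d\,D_s}\,\frac{\boldsymbol\theta}{|\boldsymbol\theta|^2} = \boldsymbol\theta - \theta_{\rm E}^2\,\frac{\boldsymbol\theta}{|\boldsymbol\theta|^2} , \qquad (2.13) $$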



where $\theta_{\rm E} \propto \sqrt{M}$ is the Einstein angle. Note that (2.13) can also be obtained from
(2.3) and (2.6) by setting $\Sigma(\boldsymbol\xi) = M\,\delta_{\rm D}(\boldsymbol\xi)$. The lens equation has two solutions, one on
either side of the lens (just solve the quadratic equation for $\theta$), with image separation
$\Delta\theta \gtrsim 2\theta_{\rm E} \propto \sqrt{M}$†. Hence, the image separation yields an estimate for the mass of the
lens. In general, however, more complicated mass models are needed to fit the observed
image positions in a gravitational lens system, i.e., to find a mass model and a source
position such that $\boldsymbol\beta = \boldsymbol\theta_i - \boldsymbol\alpha(\boldsymbol\theta_i)$ is satisfied for all images $\boldsymbol\theta_i$.
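
The quadratic equation for the point-mass lens can be solved in a few lines. The sketch below restricts the lens equation to the line through lens and source, $\beta = \theta - \theta_{\rm E}^2/\theta$, and works in units of the Einstein angle; the function names are illustrative.

```python
import math


def point_mass_images(beta, theta_e):
    """Both roots of theta**2 - beta*theta - theta_e**2 = 0."""
    root = math.sqrt(beta**2 + 4.0 * theta_e**2)
    return 0.5 * (beta + root), 0.5 * (beta - root)


def lens_equation(theta, theta_e):
    """Source position beta that is mapped to the image position theta."""
    return theta - theta_e**2 / theta


theta_e = 1.0   # angles in units of the Einstein angle
for beta in (0.0, 0.2, 1.0):
    t_plus, t_minus = point_mass_images(beta, theta_e)
    sep = t_plus - t_minus   # never smaller than 2 * theta_e
    print(f"beta={beta:.1f}: images at {t_plus:+.3f} and {t_minus:+.3f}, "
          f"separation {sep:.3f}")
```

The two images lie on opposite sides of the lens, and the separation stays close to twice the Einstein angle as long as the source offset is not much larger than the Einstein angle.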

† Mathematically, substantially larger image separations can occur if $|\boldsymbol\beta| \gg \theta_{\rm E}$, but this case is
astronomically irrelevant, as explained shortly.

2.3.2. Magnification

Gravitational light deflection conserves surface brightness; this follows from Liouville's
theorem, noting that light deflection is not associated with emission or absorption
processes. Therefore, $I(\boldsymbol\theta) = I^{(s)}[\boldsymbol\beta(\boldsymbol\theta)]$, where $I(\boldsymbol\theta)$ and $I^{(s)}(\boldsymbol\beta)$ denote the surface brightness
in the image and source plane. Differential light bending causes light bundles to get
distorted; for very small light bundles, the distortion is described by the Jacobian matrix



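$$ \mathcal{A}(\boldsymbol\theta) = \frac{\partial\boldsymbol\beta}{\partial\boldsymbol\theta} = \left( \delta_{ij} - \frac{\partial^2\psi(\boldsymbol\theta)}{\partial\theta_i\,\partial\theta_j} \right) = \begin{pmatrix} 1-\kappa-\gamma_1 & -\gamma_2 \\ -\gamma_2 & 1-\kappa+\gamma_1 \end{pmatrix} , \qquad (2.14) $$

where

$$ \gamma_1 = \tfrac{1}{2}(\psi_{,11} - \psi_{,22}) , \qquad \gamma_2 = \psi_{,12} \qquad (2.15) $$

are the two Cartesian components of the shear (or the tidal gravitational force). For a
small source centered on $\boldsymbol\beta_0 = \boldsymbol\theta_0 - \boldsymbol\alpha(\boldsymbol\theta_0)$:

$$ I(\boldsymbol\theta) \approx I^{(s)}\left[ \boldsymbol\beta_0 + \mathcal{A}(\boldsymbol\theta_0)\cdot(\boldsymbol\theta - \boldsymbol\theta_0) \right] . \qquad (2.16) $$

Hence, the image of a small circular source with radius $r$ is an ellipse with semi-axes
$\lambda_{1,2}^{-1}\,r$, where $\lambda_{1,2}$ are the eigenvalues of $\mathcal{A}$; the orientation of the ellipse is determined by
the shear components $\gamma_{1,2}$.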

The area distortion by differential deflection yields a magnification (since $I$ is
unchanged, and flux = $I$ × solid angle),

$$ \mu = \frac{1}{\det\mathcal{A}} = \frac{1}{(1-\kappa)^2 - |\gamma|^2} , \qquad (2.17) $$

with $|\gamma| = \sqrt{\gamma_1^2 + \gamma_2^2}$. Since $\mathcal{A}$ is different for different multiple images, the image fluxes
are different; the observed flux ratios yield the image magnification ratios. In particular,
if the image separation in a point mass lens system is substantially larger than $2\theta_{\rm E}$,
which occurs for $|\boldsymbol\beta| \gg \theta_{\rm E}$, the secondary image is very strongly demagnified and thus
invisible. Flux ratios can in principle be used to constrain lens models, in addition to
the image positions, but the magnifications are affected by small-scale structure in the
mass distribution (we shall come back to this point below), rendering them less useful
in mass model determinations. Note that the final expression in (2.17) can have either
sign; images with $\mu > 0$ have positive parity, those with $\mu < 0$ negative parity. Positive
parity images correspond to extrema of the Fermat potential $\phi$, negative parity images
correspond to saddle points of $\phi$. In the rest of this article, we will always mean the
absolute value of the magnification when writing $\mu$.
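
The strong demagnification of the secondary image for a distant source can be checked numerically. The sketch below evaluates the signed magnification $1/[(1-\kappa)^2 - |\gamma|^2]$ for the two images of a point-mass lens, using the standard results that away from the lens $\kappa = 0$ and $|\gamma| = (\theta_{\rm E}/\theta)^2$; angles are in units of the Einstein angle and the function names are illustrative.

```python
import math


def magnification(kappa, gamma1, gamma2):
    """Signed magnification 1/det A for given convergence and shear."""
    return 1.0 / ((1.0 - kappa) ** 2 - (gamma1**2 + gamma2**2))


def point_mass_image_magnifications(beta, theta_e=1.0):
    """Signed magnifications (mu_plus, mu_minus) of the two point-mass images.

    Only |gamma| enters det A, so the full shear is put into gamma1.
    """
    root = math.sqrt(beta**2 + 4.0 * theta_e**2)
    thetas = (0.5 * (beta + root), 0.5 * (beta - root))
    return tuple(magnification(0.0, (theta_e / t) ** 2, 0.0) for t in thetas)


for beta in (0.1, 1.0, 5.0):
    mu_plus, mu_minus = point_mass_image_magnifications(beta)
    print(f"beta={beta:.1f}: mu+ = {mu_plus:+.4f}, mu- = {mu_minus:+.4f}")
```

The image on the source side has positive parity and is magnified; the image on the opposite side has negative parity, and its absolute magnification drops quickly once the source lies several Einstein angles from the lens.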

2.3.3. Shape distortions 

The image shape of extended (resolved) sources is changed by lensing, since the
eigenvalues of $\mathcal{A}$ are different in general; rewriting

$$ \mathcal{A}(\boldsymbol\theta) = (1-\kappa) \begin{pmatrix} 1-g_1 & -g_2 \\ -g_2 & 1+g_1 \end{pmatrix} , \quad \text{where} \quad g = \frac{\gamma}{1-\kappa} \ \text{is the reduced shear,} \qquad (2.18) $$

one sees that the shape distortion is determined by the reduced shear. This in fact forms
the basis of weak lensing. It should be noted that giant arcs cannot be described by the
linearized lens equation, as they are too big.
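
That the image shape depends on the reduced shear alone can be made concrete with a short sketch. It computes the axis ratio of the image of a small circular source from the eigenvalues of the Jacobian matrix, $(1-\kappa)(1 \mp |g|)$; the function names are illustrative and the sketch assumes $\kappa \neq 1$ and $|g| \neq 1$.

```python
import math


def reduced_shear(kappa, gamma1, gamma2):
    """Components of g = gamma / (1 - kappa)."""
    return gamma1 / (1.0 - kappa), gamma2 / (1.0 - kappa)


def image_axis_ratio(kappa, gamma1, gamma2):
    """Minor-to-major axis ratio of the image of a small circular source.

    The semi-axes are the source radius divided by the eigenvalues of A,
    so the common factor (1 - kappa) cancels in the ratio.
    """
    g1, g2 = reduced_shear(kappa, gamma1, gamma2)
    g = math.hypot(g1, g2)
    a = 1.0 / abs(1.0 - g)
    b = 1.0 / abs(1.0 + g)
    return min(a, b) / max(a, b)


# Two different (kappa, gamma) pairs with the same reduced shear g = 0.1.
print(image_axis_ratio(0.10, 0.090, 0.0))
print(image_axis_ratio(0.55, 0.045, 0.0))
```

Both calls give the same axis ratio, $(1-|g|)/(1+|g|)$, although convergence and shear differ: shapes alone constrain only the reduced shear.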

2.3.4. Time delay 

The light travel time along the various ray paths corresponding to different images is
different in general. This implies that variations of the source luminosity will show up
as flux variations of the different images at different times, shifted by a time delay $\Delta t$,

$$ \Delta t = \frac{D_d\,D_s}{c\,D_{ds}}\,(1+z_d) \left[ \phi(\boldsymbol\theta^{(1)};\boldsymbol\beta) - \phi(\boldsymbol\theta^{(2)};\boldsymbol\beta) \right] , \qquad (2.19) $$
where $\boldsymbol\theta^{(1)}$ and $\boldsymbol\theta^{(2)}$ are the two image positions considered. In fact, $\phi(\boldsymbol\theta;\boldsymbol\beta)$ is, up to
an affine transformation, the light travel time along a ray from the source at $\boldsymbol\beta$ which
crosses the lens plane at $\boldsymbol\theta$. Recalling that $\nabla\phi(\boldsymbol\theta;\boldsymbol\beta) = 0$ is equivalent to the lens
equation, we see from (2.19) that this is Fermat's principle in lensing: the light travel
time is stationary at physical images. Note that $\Delta t \propto H_0^{-1}$, since all the distances $D$ are
$\propto c/H_0$; hence, a measurement of the time delay can be used to determine the Hubble
constant, provided the lens model is sufficiently well known. We shall return to this issue
below.

Figure 4. Illustration of critical curves and caustics for an elliptical mass distribution, and
the image geometry for various source positions (taken from Narayan & Bartelmann 1999). In
each of the two panels, the right figure shows the caustic curves, together with several different
source positions; the corresponding image positions are displayed on the left, together with the
critical curves. One sees that, depending on where the source is located relative to the caustics,
different image multiplicities occur
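
The scaling of the time delay with the distances can be tried out on the point-mass lens, whose deflection potential on the axis is $\psi = \theta_{\rm E}^2 \ln|\theta|$. The sketch below evaluates $\Delta t = \frac{D_d D_s}{c D_{ds}}(1+z_d)\,\Delta\phi$ between the saddle-point image and the minimum. All numbers (distances, lens redshift, angles) are hypothetical values chosen for illustration.

```python
import math

C = 2.998e8                            # speed of light [m/s]
MPC = 3.0857e22                        # one megaparsec [m]
ARCSEC = math.pi / (180.0 * 3600.0)    # one arcsecond [rad]
DAY = 86400.0                          # one day [s]


def fermat_potential(theta, beta, theta_e):
    """Fermat potential of a point-mass lens on the axis (angles in radians)."""
    return 0.5 * (theta - beta) ** 2 - theta_e**2 * math.log(abs(theta))


def time_delay(beta, theta_e, d_d, d_s, d_ds, z_d):
    """Delay [s] of the saddle-point image relative to the minimum image."""
    root = math.sqrt(beta**2 + 4.0 * theta_e**2)
    t_min, t_sad = 0.5 * (beta + root), 0.5 * (beta - root)
    dphi = (fermat_potential(t_sad, beta, theta_e)
            - fermat_potential(t_min, beta, theta_e))
    return d_d * d_s / (C * d_ds) * (1.0 + z_d) * dphi


# Hypothetical configuration (assumed values, not from the text).
BETA, THETA_E, Z_D = 0.5 * ARCSEC, 1.0 * ARCSEC, 0.5
D_D, D_S, D_DS = 1000 * MPC, 2000 * MPC, 1400 * MPC

dt = time_delay(BETA, THETA_E, D_D, D_S, D_DS, Z_D)
# Doubling H_0 halves every distance and therefore halves the delay.
dt_half = time_delay(BETA, THETA_E, D_D / 2, D_S / 2, D_DS / 2, Z_D)
print(f"delay = {dt / DAY:.1f} days; with doubled H_0: {dt_half / DAY:.1f} days")
```

The image at the minimum of the Fermat potential arrives first, and the delay scales inversely with the assumed Hubble constant, which is the basis of the method.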

2.3.5. General properties of lenses 

If $\Sigma(\boldsymbol\xi)$ is a smooth function, then for (nearly) every source position $\boldsymbol\beta$, the number of
images is odd ('odd-number theorem'; Burke 1981). If in addition, $\Sigma(\boldsymbol\xi)$ is non-negative,
then at least one of the images (corresponding to a minimum in light travel time) is
magnified, $\mu > 1$ ('magnification theorem'; Schneider 1984). The odd-number theorem is
violated observationally: one (usually) finds doubles and quads. The missing odd image
is expected to be close to the center of the lens, where presumably $\kappa \gg 1$, meaning that
$\mu \ll 1$; hence, this central image is highly demagnified, and thus not observable. Both
of these theorems can be generalized even to non-thin lenses (Seitz & Schneider 1992). 

The closed and smooth curves where $\det\mathcal{A}(\boldsymbol\theta) = 0$ are called critical curves. When
they are mapped back into the source plane using the lens equation, the corresponding 
curves in the source plane are called caustics. The number of images changes by ±2 if 
the source position crosses a caustic; then two images appear or merge. A source close 
to, and at the inner side of, a caustic produces two closely separated and very bright 
images near the corresponding critical curve. A source close to, and on the inner side 
of a cusp has three bright images close to the corresponding point on the critical curve. 
From singularity theory, one finds that in the limit of very large magnifications, the 
two close images on either side of the critical curve have equal magnification, and thus 
should appear equally bright. Similarly, of the three images formed near a cusp, the sum 
of the magnifications of the outer two images should equal that of the middle image, 
with corresponding consequences for the flux ratios. As we shall see, these universality 
relations are strongly violated in observed lens systems, providing a strong clue for the 
presence of substructure in the mass distribution of lens galaxies. 
3. (Strong) Lensing by galaxies

The first lensing phenomena detected were multiple images of distant QSOs caused
by the lensing effect of a foreground galaxy. If the lens is a massive (i.e., $\sim L_*$) galaxy,
the corresponding image separations are $\sim 1''$. Gravitational lens models can be used to
constrain the mass distribution in (the inner part of) these lensing galaxies; in particular,
the mass inside a circle traced by the multiple images (or the Einstein ring) can be
determined with very high precision in some cases. Furthermore, as already mentioned,
time-delays can be used to determine $H_0$. Mass substructure in these galaxies can be
(and has been) detected, and the interstellar medium of lens galaxies can be investigated. 
We shall describe some of these techniques and results in a bit more detail below. 

3.1. Mass determination 

To obtain accurate mass estimates, one needs detailed models, obtained by fitting images 
and galaxy positions (and fluxes). However, even without these detailed models, a simple 
mass estimate is possible: the mean surface mass density inside the Einstein radius $\theta_{\rm E}$
of a lens is the critical surface mass density, so that

$$ M(\theta_{\rm E}) = \pi\,(D_d\,\theta_{\rm E})^2\,\Sigma_{\rm cr} . \qquad (3.1) $$

An estimate of $\theta_{\rm E}$ is obtained as the radius of the circle tracing the multiple images
(or the ring radius in case of Einstein ring images). The estimate (3.1) is exact for
axi-symmetric lenses, and also a very good approximation for less symmetrical ones.
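
A minimal sketch of this mass estimate, $M = \pi (D_d \theta_{\rm E})^2 \Sigma_{\rm cr}$ with $\Sigma_{\rm cr} = c^2 D_s/(4\pi G D_d D_{ds})$, is given below. The Einstein radius of one arcsecond and the distances are hypothetical values chosen for illustration, and the constants are rounded.

```python
import math

G = 6.674e-11                          # gravitational constant [m^3 kg^-1 s^-2]
C = 2.998e8                            # speed of light [m/s]
MPC = 3.0857e22                        # one megaparsec [m]
M_SUN = 1.989e30                       # solar mass [kg]
ARCSEC = math.pi / (180.0 * 3600.0)    # one arcsecond [rad]


def einstein_mass(theta_e_arcsec, d_d, d_s, d_ds):
    """Mass [kg] inside the Einstein radius; distances in metres."""
    theta_e = theta_e_arcsec * ARCSEC
    sigma_cr = C**2 * d_s / (4.0 * math.pi * G * d_d * d_ds)
    return math.pi * (d_d * theta_e) ** 2 * sigma_cr


# Hypothetical angular-diameter distances (assumption, not from the text).
D_D, D_S, D_DS = 1000 * MPC, 2000 * MPC, 1400 * MPC

m_e = einstein_mass(1.0, D_D, D_S, D_DS)
print(f"M(theta_E = 1 arcsec) = {m_e / M_SUN:.2e} solar masses")
```

For these assumed numbers the enclosed mass is of order $10^{11}$ solar masses, and it scales with the square of the Einstein radius.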

3.2. Mass models 

The simplest mass model for a galaxy is that of a singular isothermal sphere (SIS), which
is an analytic solution of the Vlasov-Poisson equation of stellar dynamics (see Binney &
Tremaine 1987) and whose density profile behaves like $\rho(r) \propto r^{-2}$, so that the surface
mass density is given by

$$ \Sigma(\xi) = \frac{\sigma_v^2}{2G\,\xi} ; \quad \text{with } \sigma_v \text{: 1-D velocity dispersion.} \qquad (3.2) $$

This model is often good enough for rough estimates, in particular since the inner parts
of the radial mass profile of galaxies seem to closely follow this relation. Multiple images
occur for $\beta < \theta_{\rm E}$, and their separation is $\Delta\theta = 2\theta_{\rm E}$, with

$$ \theta_{\rm E} = 4\pi \left( \frac{\sigma_v}{c} \right)^2 \frac{D_{ds}}{D_s} \approx 1.15'' \left( \frac{\sigma_v}{200\,{\rm km/s}} \right)^2 \frac{D_{ds}}{D_s} . $$

Hence, massive ellipticals create image separations of up to $\sim 3''$, whereas for less massive
ones, and for spirals, $\Delta\theta$ is of order or below $\sim 1''$.
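
For the SIS, combining (3.2) with the critical surface mass density gives the standard Einstein angle $\theta_{\rm E} = 4\pi(\sigma_v/c)^2 D_{ds}/D_s$, and the deflection has constant magnitude $\theta_{\rm E}$. The sketch below evaluates this for a few velocity dispersions and lists the image positions on the axis; the function names and the chosen dispersions are illustrative.

```python
import math

C_KMS = 2.998e5                           # speed of light [km/s]
RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0


def sis_einstein_angle(sigma_v_kms, dds_over_ds):
    """Einstein angle of an SIS in arcseconds."""
    return (4.0 * math.pi * (sigma_v_kms / C_KMS) ** 2
            * dds_over_ds * RAD_TO_ARCSEC)


def sis_images(beta, theta_e):
    """Image positions on the axis for a source at beta (same units)."""
    if abs(beta) < theta_e:
        # Two images, one on either side of the lens, separated by 2*theta_e.
        return [beta + theta_e, beta - theta_e]
    # Outside the Einstein angle only the image on the source side remains.
    return [beta + math.copysign(theta_e, beta)]


for sigma_v in (150.0, 200.0, 300.0):
    print(f"sigma_v = {sigma_v:.0f} km/s: "
          f"theta_E = {sis_einstein_angle(sigma_v, 1.0):.2f} arcsec x D_ds/D_s")
print(sis_images(0.3, 1.0), sis_images(2.0, 1.0))
```

The quadratic dependence on the velocity dispersion is what separates the image splittings of massive ellipticals from those of less massive galaxies and spirals.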

However, this simple mass model is unrealistic owing to its diverging density for $r \to 0$
and its infinite mass; furthermore, it (like all axi-symmetric models) cannot account for
the occurrence of quadruply imaged sources. More complicated models include some or
all of these:

• A finite core size, to remove the central divergence. Applied to observed systems,
the models 'like' the core to be very small, in particular since the third or fifth image is
not seen, which needs high demagnification and thus high $\kappa$ near the center.

• Elliptical isodensity contours that break the axial symmetry, which is needed to 
explain 4-image systems; those cannot be produced by symmetric lens models. 

• External tidal field (shear): A lens galaxy is not isolated, but may be part of a group
or a cluster, and in any case the inhomogeneous matter distribution between source and
lens, and lens and observer introduces a shear (cosmic shear) of typically a few percent.
This external influence can be linearized over the region of the galaxy where multiple
images occur, and this yields a uniform shear term in the lens equation.

Figure 5. The Southern radio lobe (the Northern one is denoted by C) of the quasar MG
1654+134 (z = 1.7, central component Q coincident with the optical QSO position), shown here
as contours, is mapped into a complete Einstein ring by a foreground galaxy (G), indicated
by the grayscale image. For this lens system, probably the most detailed mass analysis was
performed - see text (image is from Langston et al. 1990)

In fact, any realistic model of a lens consists of at least an elliptical mass distribution 
and an external shear. This then yields the necessary number of free parameters for a lens 
model: 1 for the mass scale of the lens (either the Einstein radius, or $\sigma_v$), 2 for the lens
position, 2 for the source position, 2 for the lens ellipticity (axis ratio and orientation),
and 2 for the external shear (a two-component quantity). This can be compared to the
number of observables. In a quad-system, one has 2 × 4 image positions, and the 2
coordinates of the lens galaxy. In this case, the number of observational constraints is 
larger by one than the number of free parameters. In addition, one could use the flux 
ratios of the images (i.e., the magnification ratios) to constrain the lens model, but as 
we shall discuss below, these are not reliable constraints for fitting a macro-model. 

Modeling of strong lens systems which contain Einstein rings yields much better
constrained lens parameters and therefore even more accurate mass estimates; a beautiful
example of this is the radio ring MG 1654+134 (see Kochanek 1995; Wallington et al.
1996) around a foreground elliptical galaxy at z = 0.25 shown in Fig. 5.

Results from fitting parametrized mass models to observed multiple image systems 
include the following: Many lens systems require quite a strong external shear, which 
may be explained by massive ellipticals being preferentially located in dense regions, i.e. 
in clusters or groups. The orientation of the mass ellipticity follows closely that of the light
distribution; however, this is not the case for the magnitude of ellipticity (Keeton et
al. 1998). From mass modelling and detailed spectroscopic studies of lens galaxies, the
latter in combination with stellar dynamical arguments, one finds that inside the Einstein
radius, about half the mass is dark, and half is baryonic (Treu & Koopmans 2002;
Koopmans & Treu 2003); therefore, in massive (lens) galaxies, baryons have strongly
affected the mass profile, owing to their cooling and contraction. These authors have 
also shown that over the radius relevant for lensing, the profile is very well approximated 
by an isothermal one. 

This issue is closely related to the determination of the Hubble constant from measuring 
time delays in multiple image systems. At present, time delays are known for about 10 
lens systems, in some cases with an accuracy of 1%. Therefore, $H_0$ can in principle
be obtained from (2.19). However, reality turns out to be considerably more difficult.
The major difficulty lies in the so-called mass-sheet degeneracy, which says that the
transformation $\kappa(\boldsymbol\theta) \to \lambda\kappa(\boldsymbol\theta) + (1-\lambda)$ of the surface mass density leaves the image
positions and magnification ratios invariant, but changes the time delay by a factor
$\lambda$ (Falco et al. 1985). Essentially, the bracket in (2.19) depends on the mean surface
mass density within the annular region around the lens in which the multiple images 
are located (Kochanek 2002), and the mass-sheet degeneracy changes that value. One 
therefore requires additional information about the mass profile in galaxies. 
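
The mass-sheet degeneracy can be verified on a simple example. The sketch below takes an SIS on the axis (deflection of constant magnitude $\theta_{\rm E}$, potential $\theta_{\rm E}|\theta|$), applies the transformation with an arbitrary illustrative value $\lambda = 0.8$, and adds the potential $\tfrac{1}{2}(1-\lambda)\theta^2$ of the uniform sheet. It then checks that the same image positions solve the new lens equation for a rescaled source position $\lambda\beta$, and that the Fermat potential difference, and hence the time delay, is multiplied by $\lambda$.

```python
import math

THETA_E = 1.0   # Einstein angle of the SIS; sets the angular unit


def alpha_sis(theta):
    return math.copysign(THETA_E, theta)


def psi_sis(theta):
    return THETA_E * abs(theta)


def transformed(lam):
    """Deflection and potential after kappa -> lam*kappa + (1 - lam)."""
    def alpha(theta):
        return lam * alpha_sis(theta) + (1.0 - lam) * theta

    def psi(theta):
        return lam * psi_sis(theta) + 0.5 * (1.0 - lam) * theta**2

    return alpha, psi


def fermat(theta, beta, psi):
    return 0.5 * (theta - beta) ** 2 - psi(theta)


beta, lam = 0.3, 0.8                        # illustrative values
images = (beta + THETA_E, beta - THETA_E)   # SIS images of a source at beta
alpha_l, psi_l = transformed(lam)

# The same image positions now belong to a source at lam * beta.
sources = [t - alpha_l(t) for t in images]

dphi_orig = fermat(images[1], beta, psi_sis) - fermat(images[0], beta, psi_sis)
dphi_new = (fermat(images[1], lam * beta, psi_l)
            - fermat(images[0], lam * beta, psi_l))
print(sources, dphi_new / dphi_orig)
```

Since the source position is not observable, the image positions alone cannot distinguish the two models, while the predicted time delays differ by the factor $\lambda$.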

As mentioned above, an isothermal profile ($\kappa \propto \theta^{-1}$) provides a reasonable fit to lens
systems. Assuming an isothermal profile yields values of the Hubble constant of order
$H_0 \approx 50$ km/s/Mpc, consistently for the 'simple' lens systems. This is at variance with
the value $H_0 \approx 72$ km/s/Mpc obtained from the Hubble Key Project (Freedman et al.
2001). On the other hand, the isothermal profile is at best a reasonable approximation
to the real mass profile. Cosmological simulations yield a cuspy profile, such as the
one found by Navarro et al. (1997; hereafter NFW). These dark matter profiles are then
modified by baryons cooling inside these halos; the larger the baryon fraction, the more
the dark matter profiles are affected. Kochanek (2003) pointed out that in order to get
a value of $H_0$ from lensing which is compatible with that from the Hubble Key Project,
one would need a baryon fraction as large as 20% of the total dark matter in the halo
to cool, in order to render the central mass profiles of lenses steep enough; in effect, that
leads to constant-$M/L$ models within the region where the multiple images are found.
This high fraction of cold baryons in galaxies is at odds with the local inventories of 
baryonic mass in galaxies. At present, the origin of this discrepancy is not known. 

3.3. Substructure in lens galaxies 

Whereas 'simple' lens models can usually fit the image positions very well, in most lens 
systems they are unable to provide a good fit to the flux ratios. The best known (but
by far not the worst) case is QSO 1422+231, where several groups have tried, and failed, to
obtain a good lens model explaining image positions and fluxes. Mao & Schneider (1998) 
have provided an analytical argument, based on the universality of the lens mapping 
near cusps, why one would not expect to find a smooth model for this system. Since 
the flux ratios in this system are most reliably and accurately measured in the radio, 
absorption by the ISM in the (elliptical) lens galaxy is expected to be negligible. We 
argued that small-scale structure in the mass distribution can change the magnification, 
but leave the image position essentially unchanged; this is due to the fact that the 
deflection angle depends on first partial derivatives of the deflection potential $\psi$, whereas
the magnification depends on $\kappa$ and $\gamma$, and thus on second derivatives of $\psi$; those are
more strongly affected by small-scale structure. 
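
The scaling behind this argument can be made explicit with a toy perturbation. The following is only an illustrative sketch and is not taken from the papers cited above: the sinusoidal form of $\delta\psi$, its amplitude $A$ and its wave vector $\boldsymbol k$ (wavelength $\lambda = 2\pi/k$) are assumptions chosen for simplicity, and the standard relations $\boldsymbol\alpha = \nabla\psi$ and $\kappa = \nabla^2\psi/2$ are used.

```latex
\delta\psi(\boldsymbol\theta) = A \cos(\boldsymbol k\cdot\boldsymbol\theta)
\;\Rightarrow\;
\delta\boldsymbol\alpha = \nabla\delta\psi
  = -A\,\boldsymbol k\,\sin(\boldsymbol k\cdot\boldsymbol\theta) ,
\qquad
\delta\kappa = \tfrac{1}{2}\nabla^2\delta\psi
  = -\tfrac{1}{2} A\,k^2 \cos(\boldsymbol k\cdot\boldsymbol\theta) ,
\\[4pt]
\frac{|\delta\boldsymbol\alpha|_{\max}}{|\delta\kappa|_{\max}}
  = \frac{2}{k} = \frac{\lambda}{\pi} .
```

For a fixed perturbation of the convergence, the shift in image position therefore shrinks in proportion to the wavelength of the perturbation, whereas the magnification, which depends on $\kappa$ and $\gamma$ directly, responds at full strength.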

This effect has been known for a long time: the optical and UV radiation from QSOs 
comes from a region small enough that even stars in the lens galaxy can affect their 
magnification, whereas the corresponding deflections are of order $10^{-6}$ arcseconds - this 
microlensing phenomenon (see, e.g., Wambsganss et al. 1990) shows up as uncorrelated 
brightness variations in the multiple images, and has clearly been detected in the QSO 
2237+0305 (Schmidt et al. 2002) and in some of the other lens systems. However, the 
VLBI images of QSO 1422+231 are extended, and therefore individual stars cannot affect 
their magnification. But massive structures with $M \gtrsim 10^7 M_\odot$ can change their 
magnification. Recall that CDM models of structure formation actually predict the presence 
of sub-halos in each massive galaxy - the (missing, since unobserved) satellites. As 
shown by Dalal & Kochanek (2002; and references therein), the statistics of mismatches 
between observed flux ratios and those predicted by simple lens models which fit the image 
positions in 4-image systems is in agreement with expectations from CDM satellites. 
Bradac et al. (2002) have explicitly demonstrated that the flux ratio problem in QSO 
1422+231 is cured by placing a low-mass halo near one of the QSO images. Kochanek & 
Dalal (2003) have considered, and ruled out, alternative explanations for the flux ratio 
mismatches, such as interstellar scattering. One of the signatures of mass substructure, 
first found in investigations of microlensing (Schechter & Wambsganss 2002), is that the 
brightest saddle point is expected to be affected most, in the sense that it has a high 
probability of being substantially demagnified. Kochanek & Dalal have shown that this 
particular behavior is seen in a sample of 7 quadruple image lens systems. This behavior 
cannot be explained by absorption, scattering or scintillation by the interstellar medium 
of the lens galaxy. Hence, lensing has probably detected the 'missing' satellites in galaxy 
halos; the observed flux mismatches require a mass fraction in subclumps of order a few 
percent of the total lens mass, in accordance with predictions from CDM simulations. 
Bradac et al. (2002, 2003) have generated synthetic lens systems, using model galaxies as 
obtained from CDM simulations as deflectors, and have shown that the resulting image 
fluxes are at variance with the predictions from simple lens models, again due to the 
substructure in the mass distribution. In a few of the observed lens systems, a fairly 
massive subclump can be identified directly by its luminosity, yielding further support 
to this interpretation. 

3.4. Other properties of lens galaxies 

3.4.1. Evolution 

Early-type galaxies are known to be located on the so-called fundamental plane (FP), 
i.e., there is a relation between their central surface brightness, their effective radius 
and the velocity dispersion in these galaxies. The FP has been observed even to high 
redshifts, using early-type galaxies in high-redshift clusters; it is known to evolve with 
z, mainly due to passive evolution of the stellar population. The lens galaxies form a 
mass-selected sample of galaxies not selected for cluster membership, and it is therefore 
of great interest to see whether they also obey a FP relation. In fact, since lens galaxies 
have a well-determined mass scale (or $\sigma_v$, after fitting an isothermal mass model), and are 
spread over a large redshift range, they are ideal for FP research. Rusin et al. (2003) have 
found from a sample of 28 lenses that the evolution of the FP is compatible with passive 
stellar evolution, and that it favours a high redshift for the epoch of star formation in 
these galaxies. 

3.4.2. The interstellar matter in lens galaxies 

Multiple image systems provide us with views of the same source along different lines- 
of-sight. Excluding time-delay effects in connection with spectral variability, as well as 
differential magnification, spectral differences between the images can then only be caused 
by propagation effects. In particular, one can study the properties of the dust in lens 
galaxies, as color differences between images can be attributed to different extinction and 
reddening along the different lines-of-sight through the lens galaxy. Falco et al. (1999) 
have investigated 23 gravitational lens galaxies over the range $0 \lesssim z_{\rm d} \lesssim 1$. Given that 
most lens galaxies are early types, they found a small median differential extinction of 
$\Delta E(B-V) \sim 0.05$, with slightly larger (smaller) values for radio- (optically-)selected lens 
systems. The lack of a clear correlation with the separation of the image from the lens 
center points towards patchy extinction. Two spiral lens galaxies show a substantially 
larger extinction. The extinction law, i.e., the relation between extinction and reddening, 
varies between different lens galaxies over quite some range; the Galactic extinction law 
is therefore by no means universally applicable. 



4. Weak gravitational lensing 

Multiple images, microlensing (with appreciable magnifications) and arcs in clusters 
are phenomena of strong lensing. In weak gravitational lensing, the Jacobi matrix $\mathcal{A}$ is 
very close to the unit matrix, which implies weak distortions and small magnifications. 
Those cannot be identified in individual sources, but only in a statistical sense; the basics 
of these effects will be described in this section, and several applications will be discussed 
in later sections. 



4.1. Distortion of faint galaxy images 

Images of distant, extended sources are distorted in shape and size; this is described by 
the locally linearized lens equation around the image center $\boldsymbol\theta_0$, 

$$\boldsymbol\beta - \boldsymbol\beta_0 = \mathcal{A}(\boldsymbol\theta_0)\cdot(\boldsymbol\theta - \boldsymbol\theta_0) ,$$ (4.1) 

where $\boldsymbol\beta_0 = \boldsymbol\beta(\boldsymbol\theta_0)$, with the Jacobian (2.18), and the invariance of surface brightness 
(2.16). Recall that the shape distortion is described by the (reduced) shear which is a 
two-component quantity, most conveniently written as a complex number, 

$$\gamma = \gamma_1 + {\rm i}\gamma_2 = |\gamma|\,{\rm e}^{2{\rm i}\varphi} ; \quad g = g_1 + {\rm i}g_2 = |g|\,{\rm e}^{2{\rm i}\varphi} ;$$ (4.2) 

its amplitude describes the degree of distortion, whereas its phase $\varphi$ yields the direction of 
distortion. The reason for the factor '2' in the phase is the fact that an ellipse transforms 
into itself after a rotation by 180°. Consider a circular source with radius $r$; mapped by 
the local Jacobi matrix, its image is an ellipse, with semi-axes 

$$a = \frac{r}{1-\kappa-|\gamma|} = \frac{r}{(1-\kappa)(1-|g|)} ; \quad b = \frac{r}{1-\kappa+|\gamma|} = \frac{r}{(1-\kappa)(1+|g|)} ,$$ 

and the major axis encloses an angle $\varphi$ with the positive $\theta_1$-axis. Hence, if circular 
sources could be identified, the measured image ellipticities would immediately yield the 
value of the reduced shear, through the axis ratio 

$$|g| = \frac{1 - b/a}{1 + b/a} \quad\Leftrightarrow\quad \frac{b}{a} = \frac{1-|g|}{1+|g|} ,$$ 

and the orientation of the major axis $\varphi$. However, faint galaxies are not intrinsically 
round, so that the observed image ellipticity is a combination of intrinsic ellipticity and 
shear. The strategy to nevertheless obtain an estimate of the (reduced) shear consists in 
locally averaging over many galaxy images, assuming that the intrinsic ellipticities are 
randomly oriented. In order to follow this strategy, one needs to clarify first how to define 
'ellipticity' for a source with arbitrary isophotes (faint galaxies are not simply elliptical); 
in addition, seeing caused by atmospheric turbulence will blur - and thus circularize - 
observed images. We will consider these issues in turn. 
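
The mapping of a circular source can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the relation $g = \gamma/(1-\kappa)$ is the usual definition of the reduced shear (assumed here, since it is introduced in an earlier section), and the formulae hold for $\kappa < 1$ and $|g| < 1$.

```python
import cmath

def image_semi_axes(r, kappa, gamma):
    """Semi-axes (a, b) of the image of a circular source of radius r.

    Assumes kappa < 1 and |g| < 1, with g = gamma / (1 - kappa).
    """
    g_abs = abs(gamma) / (1.0 - kappa)
    a = r / ((1.0 - kappa) * (1.0 - g_abs))   # major axis
    b = r / ((1.0 - kappa) * (1.0 + g_abs))   # minor axis
    return a, b

def reduced_shear_from_axes(a, b):
    """Recover |g| from the measured axis ratio b/a of a lensed circular source."""
    q = b / a
    return (1.0 - q) / (1.0 + q)

# Hypothetical example values for convergence and shear.
kappa = 0.2
gamma = 0.1 * cmath.exp(2j * 0.3)
a, b = image_semi_axes(1.0, kappa, gamma)
g_abs = reduced_shear_from_axes(a, b)
```

The round trip returns $|g| = |\gamma|/(1-\kappa)$, which illustrates that the axis ratio of a lensed circular source constrains only the reduced shear, not $\kappa$ and $\gamma$ separately.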

4.2. Measurements of shapes and shear 

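Let $I(\boldsymbol\theta)$ be the brightness distribution of an image, assumed to be isolated on the sky; 
the center of the image can be defined as 

$$\bar{\boldsymbol\theta} = \frac{\int {\rm d}^2\theta\; q_I[I(\boldsymbol\theta)]\,\boldsymbol\theta}{\int {\rm d}^2\theta\; q_I[I(\boldsymbol\theta)]} ,$$ (4.3) 

where $q_I(I)$ is a suitably chosen weight function; e.g., if $q_I(I) = I\,{\rm H}(I - I_{\rm th})$, $\bar{\boldsymbol\theta}$ would be 
the center of light within a limiting isophote of the image (where H denotes the Heaviside 
step function). We next define the tensor of second brightness moments, 

$$Q_{ij} = \frac{\int {\rm d}^2\theta\; q_I[I(\boldsymbol\theta)]\,(\theta_i - \bar\theta_i)(\theta_j - \bar\theta_j)}{\int {\rm d}^2\theta\; q_I[I(\boldsymbol\theta)]} , \quad i,j \in \{1,2\} .$$ (4.4) 

Note that for an image with circular isophotes, $Q_{11} = Q_{22}$, and $Q_{12} = 0$. The trace of $Q$ 
describes the size of the image, whereas the traceless part of $Q_{ij}$ contains the ellipticity 
information. 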

From $Q_{ij}$, one defines two complex ellipticities, 

$$\chi = \frac{Q_{11} - Q_{22} + 2{\rm i}Q_{12}}{Q_{11} + Q_{22}} \quad {\rm and} \quad \epsilon = \frac{Q_{11} - Q_{22} + 2{\rm i}Q_{12}}{Q_{11} + Q_{22} + 2\left(Q_{11}Q_{22} - Q_{12}^2\right)^{1/2}} .$$ (4.5) 

Both of them have the same phase (because of the same numerator), but a different 
absolute value; for an image with elliptical isophotes of axis ratio $r \le 1$, one obtains 

$$|\chi| = \frac{1 - r^2}{1 + r^2} ; \quad |\epsilon| = \frac{1 - r}{1 + r} .$$ (4.6) 

Which of these two definitions is more convenient depends on the context; one can easily 
transform one into the other, 

$$\epsilon = \frac{\chi}{1 + \left(1 - |\chi|^2\right)^{1/2}} ; \quad \chi = \frac{2\epsilon}{1 + |\epsilon|^2} .$$ (4.7) 
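
The definitions of the centroid, the second-moment tensor and the two ellipticities can be tried out on a synthetic image. The following sketch is illustrative only: it assumes the simplest weight function $q_I(I) = I$ (no isophote threshold), uses a hypothetical elliptical Gaussian as the test image, and replaces the integrals by sums over a regular pixel grid.

```python
import cmath
import math

def gaussian_image(xs, ys, sigma_a, axis_ratio, phi):
    """Elliptical Gaussian with major-axis width sigma_a, axis ratio r = b/a and
    position angle phi (a hypothetical test image, not a real galaxy)."""
    sigma_b = axis_ratio * sigma_a
    c, s = math.cos(phi), math.sin(phi)
    image = []
    for y in ys:
        row = []
        for x in xs:
            u = c * x + s * y      # coordinate along the major axis
            v = -s * x + c * y     # coordinate along the minor axis
            row.append(math.exp(-0.5 * ((u / sigma_a) ** 2 + (v / sigma_b) ** 2)))
        image.append(row)
    return image

def second_moments(image, xs, ys):
    """Centroid and tensor Q_ij for the weight q_I(I) = I, as sums over pixels."""
    norm = sum(sum(row) for row in image)
    cx = sum(w * x for row in image for w, x in zip(row, xs)) / norm
    cy = sum(w * y for row, y in zip(image, ys) for w in row) / norm
    q11 = q22 = q12 = 0.0
    for row, y in zip(image, ys):
        for w, x in zip(row, xs):
            q11 += w * (x - cx) ** 2
            q22 += w * (y - cy) ** 2
            q12 += w * (x - cx) * (y - cy)
    return q11 / norm, q22 / norm, q12 / norm

def ellipticities(q11, q22, q12):
    """Complex ellipticities chi and epsilon built from Q_ij."""
    numerator = complex(q11 - q22, 2.0 * q12)
    chi = numerator / (q11 + q22)
    eps = numerator / (q11 + q22 + 2.0 * math.sqrt(q11 * q22 - q12 ** 2))
    return chi, eps

# Test image: axis ratio r = 0.5, position angle 0.4 rad, on a 161 x 161 grid.
grid = [-8.0 + 0.1 * k for k in range(161)]
q11, q22, q12 = second_moments(gaussian_image(grid, grid, 1.5, 0.5, 0.4), grid, grid)
chi, eps = ellipticities(q11, q22, q12)
```

For this image one expects $|\chi| = (1-r^2)/(1+r^2) = 0.6$ and $|\epsilon| = (1-r)/(1+r) = 1/3$, with both phases equal to twice the position angle. Real galaxy images additionally require a weight function that suppresses noise and neighbouring objects.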



4.2.1. From source to image ellipticities 

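In total analogy, one defines the second-order brightness tensor $Q^{(s)}_{ij}$, and the complex 
ellipticities $\chi^{(s)}$ and $\epsilon^{(s)}$ for the unlensed source. From 

$$Q^{(s)}_{ij} = \frac{\int {\rm d}^2\beta\; q_I[I^{(s)}(\boldsymbol\beta)]\,(\beta_i - \bar\beta_i)(\beta_j - \bar\beta_j)}{\int {\rm d}^2\beta\; q_I[I^{(s)}(\boldsymbol\beta)]} , \quad i,j \in \{1,2\} ,$$ (4.8) 

one finds with ${\rm d}^2\beta = \det\mathcal{A}\;{\rm d}^2\theta$, $\boldsymbol\beta - \bar{\boldsymbol\beta} = \mathcal{A}\,(\boldsymbol\theta - \bar{\boldsymbol\theta})$ that 

$$Q^{(s)} = \mathcal{A}\,Q\,\mathcal{A}^{\rm T} = \mathcal{A}\,Q\,\mathcal{A} ,$$ (4.9) 

where $\mathcal{A} \equiv \mathcal{A}(\bar{\boldsymbol\theta})$ is the Jacobi matrix of the lens equation at position $\bar{\boldsymbol\theta}$. Using the 
definitions of the complex ellipticities, one finds the transformations: 

$$\chi^{(s)} = \frac{\chi - 2g + g^2\chi^*}{1 + |g|^2 - 2{\cal R}{\rm e}(g\chi^*)} ; \quad \epsilon^{(s)} = \begin{cases} \dfrac{\epsilon - g}{1 - g^*\epsilon} & {\rm if}\ |g| \le 1 \\[8pt] \dfrac{1 - g\epsilon^*}{\epsilon^* - g^*} & {\rm if}\ |g| > 1 \end{cases}$$ (4.10) 

The inverse transformations are obtained by interchanging source and image ellipticities, 
and $g \to -g$ in the foregoing equations. 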

4.2.2. Estimating the (reduced) shear 

In the following we make the assumption that the intrinsic orientation of galaxies is 
random, 

$${\rm E}\left(\chi^{(s)}\right) = 0 = {\rm E}\left(\epsilon^{(s)}\right) ,$$ (4.11) 

which is expected to be valid since there should be no direction singled out in the Universe. 
This then implies that the expectation value of $\epsilon$ is [as obtained by averaging the 
transformation law (4.10) over the phase of the intrinsic source orientation (Schramm & 
Kaiser 1995; Seitz & Schneider 1997)] 

$${\rm E}(\epsilon) = \begin{cases} g & {\rm if}\ |g| \le 1 \\[4pt] 1/g^* & {\rm if}\ |g| > 1 \end{cases}$$ (4.12) 



This is a remarkable result, since it shows that each image ellipticity provides an unbiased 
estimate of the local shear, though a very noisy one. The noise is determined by the 
intrinsic ellipticity dispersion 

$$\sigma_\epsilon = \sqrt{\left\langle \epsilon^{(s)}\epsilon^{(s)*} \right\rangle} .$$ 

This noise can be beaten down by averaging over many galaxy images. Fortunately, we 
live in a Universe where the sky is 'full of faint galaxies', as was impressively demonstrated 
by the Hubble Deep Field images (Williams et al. 1996). Hence, the accuracy of a shear 
estimate depends on the local number density of galaxies for which a shape can be 
measured. In order to obtain a high density, one requires deep imaging observations. As 
a rough guide, on a 3 hour exposure under excellent observing conditions with a 3-meter 
class telescope, about 30 galaxies per arcmin 2 can be used for a shape measurement. 
Note that in the weak lensing regime, $\kappa \ll 1$, $|\gamma| \ll 1$, one finds 

$$\gamma \approx g \approx \langle\epsilon\rangle \approx \frac{\langle\chi\rangle}{2} .$$ (4.13) 
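
The statement that each image ellipticity is an unbiased estimate of $g$ (or of $1/g^*$ for $|g| > 1$) can be verified numerically. The sketch below assumes the standard form of the source-to-image transformation for $\epsilon$ (Seitz & Schneider 1997), namely $\epsilon = (\epsilon^{(s)} + g)/(1 + g^*\epsilon^{(s)})$ for $|g| \le 1$ and $\epsilon = (1 + g\,\epsilon^{(s)*})/(\epsilon^{(s)*} + g^*)$ for $|g| > 1$; the function names and example values are hypothetical.

```python
import cmath
import math

def lensed_ellipticity(eps_s, g):
    """Image ellipticity epsilon for intrinsic ellipticity eps_s and reduced shear g."""
    if abs(g) <= 1.0:
        return (eps_s + g) / (1.0 + g.conjugate() * eps_s)
    return (1.0 + g * eps_s.conjugate()) / (eps_s.conjugate() + g.conjugate())

def phase_average(eps_s_abs, g, n_phase=720):
    """Average the image ellipticity over equally spaced intrinsic orientations
    at fixed intrinsic modulus eps_s_abs < 1."""
    total = 0j
    for k in range(n_phase):
        eps_s = eps_s_abs * cmath.exp(2j * math.pi * k / n_phase)
        total += lensed_ellipticity(eps_s, g)
    return total / n_phase

g_weak = complex(0.2, -0.1)      # |g| < 1
g_strong = complex(1.5, 0.5)     # |g| > 1
mean_weak = phase_average(0.4, g_weak)
mean_strong = phase_average(0.4, g_strong)
```

Because the average holds for every intrinsic modulus separately, it also holds for any distribution of $|\epsilon^{(s)}|$. With a finite number of randomly oriented galaxies the mean scatters around this value with a dispersion of order $\sigma_\epsilon/\sqrt{N}$.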

4.3. Problems in measuring shear, and their solutions 
4.3.1. Major problems 

• Seeing, that is the finite size of the point spread function (PSF), circularizes images; 
this effect is severe since faint galaxies (i.e. those at a magnitude limit for which the 
number density is of order 30 per arcmin 2 ) are not larger than the typical seeing disk. 
Therefore, weak lensing requires imaging with very good seeing. 

• The PSF is not circular, owing to e.g., wind shake of the telescope, or tracking errors. 
However, an anisotropic PSF causes round sources to appear elliptical, and thus mimics 
shear. 

• Galaxy images are not isolated, and therefore the integrals in the definition of $Q_{ij}$ 
have to be cut off at a finite radius. Hence, one usually uses a weight function $q$ which 
depends explicitly on $|\boldsymbol\theta - \bar{\boldsymbol\theta}|$; however, this modifies the transformation (4.10) between 
image and source ellipticity. 

• The sky noise, i.e. the finite brightness of the night sky, introduces a noise component 
in the measurement of image ellipticities from CCD data, so that only for high-S/N 
objects can a shape be measured. 

• Distortion by telescope and camera optics renders the coaddition of exposures com- 
plex; one needs to employ remapping, using accurate astrometry, and sub-pixel coaddi- 
tion. 

Depending on the science application, the shear one wants to measure is of order a few 
percent, or even smaller. Essentially all of the effects mentioned can introduce ellipticities 
of the same order in the measured images if they are not carefully taken into account. 

4.3.2. Solutions 

In order to deal with these issues, specific software has been developed. The one that 
has been used most up to now is the Kaiser et al. (1995, KSB) method, or its implementation 
IMCAT. Its basic idea is as follows: the image ellipticity will be determined 
by the intrinsic ellipticity, the (reduced) shear, the size of the PSF and its anisotropy. 
Note that the PSF can be investigated by identifying stars (that is: point sources) on 
the images. The response of $\chi$ to the shear depends on the size of the source - for small 
sources, blurring by seeing reduces the response to a large degree. The size of the sources 
is estimated from the size of the seeing-convolved images. In addition, the response of 
$\chi$ to a PSF anisotropy also depends on the image size. The KSB method, an essential 
part of which was put forward by Luppino & Kaiser (1997; for a complete derivation, see 
Sect. 4.6.2 of Bartelmann & Schneider 2001), results in the relation 

$$\chi^{\rm obs}_\alpha = \chi^0_\alpha + P^{\rm sm}_{\alpha\beta}\,q_\beta + P^g_{\alpha\beta}\,g_\beta ,$$ (4.14) 

where $\chi^0$ is the ellipticity of the source convolved with the isotropic part of the PSF and 
therefore, its expectation value vanishes, according to the assumption of randomly oriented 
sources, ${\rm E}(\chi^0) = 0$. The tensor $P^{\rm sm}$ describes the response of the image ellipticity 
to the PSF anisotropy, which is quantified by the (complex) ellipticity $q$ of the PSF, as 
measured from stars. $P^g$ is a tensor which describes the response of the image ellipticity 
to an applied shear. Both $P^{\rm sm}$ and $P^g$ are calculated for each image individually; they 
depend on higher-order moments of the image brightness distribution and the size of the 
PSF. Detailed simulations (e.g., Erben et al. 2001; Bacon et al. 2001) have shown that 
the KSB method can measure shear with better than $\sim 10\%$ accuracy. Several other 
methods for measuring image shapes in order to obtain an estimate of the local shear 
have been developed and some of them have already been applied to observational data 
(Bonnet & Mellier 1995; Kuijken 1999; Kaiser 2000; Bernstein & Jarvis 2002; Refregier 
& Bacon 2003). 

4.4. Magnification effects 

The magnification caused by the differential light bending changes the apparent brightness 
of sources; this leads to two effects: 

(a) The observed flux $S$ from a source is changed from its unlensed value $S_0$ according 
to $S = \mu S_0$; if $\mu > 1$, sources appear brighter than they would without an intervening 
lens. 

(b) A population of sources in the unlensed solid angle $\omega_0$ is spread over the solid angle 
$\omega = \mu\,\omega_0$ due to the magnification. 

These two effects affect the number counts of sources differently; which one of them wins 
depends on the slope of the number counts; one finds 

$$n(>S, z) = \frac{1}{\mu}\, n_0\!\left(>\frac{S}{\mu}, z\right) ,$$ (4.15) 

where $n(>S,z)$ and $n_0(>S,z)$ are the lensed and unlensed cumulative number counts 
of sources, respectively. The first argument of $n_0$ accounts for the change of the flux, 
whereas the prefactor in (4.15) stems from the change of apparent solid angle. 

As an illustrative example, we consider the case that the source counts follow a power 
law, 

$$n_0(>S) = a\,S^{-\alpha} ;$$ (4.16) 

one then finds for the lensed counts in a region of the sky with magnification $\mu$: 

$$\frac{n(>S)}{n_0(>S)} = \mu^{\alpha - 1} ,$$ (4.17) 

and therefore, if $\alpha > 1$ ($< 1$), source counts are enhanced (depleted); the steeper the 
counts, the stronger the effect. In the case of weak lensing, where $|\mu - 1| \ll 1$, one 
probes the source counts only over a small range in flux, so that they can always be 
approximated (locally) by a power law. 

One important example is provided by the lensing of QSOs. The QSO number counts 
are steep at the bright end, and flat for fainter sources. This implies that in regions 
of magnification $\mu > 1$, bright QSOs should be overdense, faint ones underdense. This 
magnification bias is the reason why the fraction of lensed sources is much higher in 
bright QSO samples than in fainter ones! 
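
The competition between flux boost and solid-angle dilution is easy to evaluate. The sketch below is illustrative only, with hypothetical function names and slopes; it applies $n(>S) = \mu^{-1} n_0(>S/\mu)$ to power-law counts $n_0(>S) = a\,S^{-\alpha}$ and compares the result with $\mu^{\alpha-1}$.

```python
def lensed_counts(n0, flux, mu):
    """Lensed cumulative counts n(>S) = n0(>S/mu) / mu for unlensed counts n0."""
    return n0(flux / mu) / mu

def power_law_ratio(mu, alpha):
    """Expected ratio n(>S)/n0(>S) = mu**(alpha - 1) for power-law counts."""
    return mu ** (alpha - 1.0)

def make_power_law(amplitude, alpha):
    """Unlensed cumulative counts n0(>S) = amplitude * S**(-alpha)."""
    return lambda flux: amplitude * flux ** (-alpha)

mu = 1.2                                  # hypothetical magnification
steep = make_power_law(5.0, 2.5)          # steep counts (bright end)
flat = make_power_law(5.0, 0.5)           # flat counts (faint end)
ratio_steep = lensed_counts(steep, 3.0, mu) / steep(3.0)
ratio_flat = lensed_counts(flat, 3.0, mu) / flat(3.0)
```

With these example slopes the steep population is enhanced and the flat one is depleted, independent of the flux at which the counts are evaluated.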

4.5. Tangential and cross component of shear 
4.5.1. The shear components 

The shear components $\gamma_1$ and $\gamma_2$ are defined relative to a reference Cartesian coordinate 
frame. Note that the shear is not a vector, owing to its transformation properties under 
rotations, which is the same as that of the linear polarization; it is therefore called a polar. 
In analogy with vectors, it is often useful to consider the shear components in a rotated 
reference frame, that is, to measure them w.r.t. a different direction; for example, the 
arcs in clusters are tangentially aligned, and so their ellipticity is oriented approximately 
tangent to the radius vector in the cluster. 

If $\phi$ specifies a direction, one defines the tangential and cross components of the shear 
relative to this direction as 

$$\gamma_{\rm t} = -{\cal R}{\rm e}\left[\gamma\,{\rm e}^{-2{\rm i}\phi}\right] , \quad \gamma_\times = -{\cal I}{\rm m}\left[\gamma\,{\rm e}^{-2{\rm i}\phi}\right] .$$ (4.18) 

For example, in the case of a circularly-symmetric matter distribution, the shear at any 
point will be oriented tangent to the direction towards the center of symmetry. Thus in 
this case, choose $\phi$ to be the polar angle of a point; then, $\gamma_\times = 0$. In full analogy to the 
shear, one defines the tangential and cross components of an image ellipticity, $\epsilon_{\rm t}$ and $\epsilon_\times$. 
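
The decomposition $\gamma_{\rm t} = -{\cal R}{\rm e}[\gamma\,{\rm e}^{-2{\rm i}\phi}]$, $\gamma_\times = -{\cal I}{\rm m}[\gamma\,{\rm e}^{-2{\rm i}\phi}]$ amounts to a rotation of the complex shear. A minimal sketch, with a hypothetical function name and example values:

```python
import cmath
import math

def tangential_cross(gamma, phi):
    """Tangential and cross components of a complex shear (or ellipticity)
    relative to the direction with polar angle phi."""
    rotated = gamma * cmath.exp(-2j * phi)
    return -rotated.real, -rotated.imag

# A purely tangential shear of amplitude 0.05 at the position (3, 4)
# relative to a lens at the origin (hypothetical example values).
phi = math.atan2(4.0, 3.0)
gamma = -0.05 * cmath.exp(2j * phi)
gamma_t, gamma_x = tangential_cross(gamma, phi)
```

On the positive $\theta_1$-axis ($\phi = 0$), a tangentially stretched image is elongated along $\theta_2$ and thus has $\gamma_1 < 0$; the minus signs in the definition ensure that $\gamma_{\rm t}$ is positive in that case.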

4.5.2. Minimum lens strength for its weak lensing detection 

As a first application of this decomposition, we consider how massive a lens needs 
to be in order to produce a detectable weak lensing signal. For this purpose, consider 
a lens modeled as an SIS with one-dimensional velocity dispersion $\sigma_v$. In the annulus 
$\theta_{\rm in} \le \theta \le \theta_{\rm out}$, centered on the lens, let there be $N = n\pi(\theta_{\rm out}^2 - \theta_{\rm in}^2)$ galaxy images 
(with $n$ the number density of galaxy images) 
with positions $\boldsymbol\theta_i = \theta_i(\cos\phi_i, \sin\phi_i)$ and (complex) ellipticities $\epsilon_i$. For each one of them, 
consider the tangential ellipticity 

$$\epsilon_{{\rm t}i} = -{\cal R}{\rm e}\left(\epsilon_i\,{\rm e}^{-2{\rm i}\phi_i}\right) .$$ (4.19) 

Next we define a statistical quantity to measure the degree of tangential alignment of 
the galaxy images, and thus the lens strength: 

$$X = \sum_{i=1}^{N} a_i\,\epsilon_{{\rm t}i} ,$$ (4.20) 
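
where the factors $a_i = a(\theta_i)$ are arbitrary at this point, and will later be chosen such as 
to maximize the signal-to-noise ratio of this estimator. The shear of an SIS is given by 

$$\gamma_{\rm t}(\theta) = \frac{\theta_{\rm E}}{2\theta} = 2\pi\left(\frac{\sigma_v}{c}\right)^2 \frac{D_{\rm ds}}{D_{\rm s}}\,\frac{1}{\theta} .$$ (4.21) 

The expectation value of the ellipticity is ${\rm E}(\epsilon_{{\rm t}i}) = \gamma_{\rm t}(\theta_i)$, so that ${\rm E}(X) = \theta_{\rm E}\sum_i a_i/(2\theta_i)$. 
Since $\epsilon_{{\rm t}i} = \epsilon^{(s)}_{{\rm t}i} + \gamma_{\rm t}(\theta_i)$ in the weak lensing regime, one has ${\rm E}(\epsilon_{{\rm t}i}\epsilon_{{\rm t}j}) = \gamma_{\rm t}(\theta_i)\gamma_{\rm t}(\theta_j) + \delta_{ij}\,\sigma_\epsilon^2/2$, 
thus 

$${\rm E}(X^2) = \sum_{i,j=1}^{N} a_i a_j\,{\rm E}(\epsilon_{{\rm t}i}\epsilon_{{\rm t}j}) = [{\rm E}(X)]^2 + \frac{\sigma_\epsilon^2}{2}\sum_{i=1}^{N} a_i^2 .$$ (4.22) 

Therefore, the signal-to-noise ratio for a detection of the lens is 

$$\frac{S}{N} = \frac{\theta_{\rm E}}{\sqrt{2}\,\sigma_\epsilon}\,\frac{\sum_i a_i\,\theta_i^{-1}}{\sqrt{\sum_i a_i^2}} .$$ (4.23) 

The $a_i$ can now be chosen so as to maximize S/N; from differentiation of S/N, one finds a 
maximum if $a_i \propto 1/\theta_i$. Then, performing the ensemble average over the galaxy positions, 
one finally obtains: 

$$\frac{S}{N} = \frac{\theta_{\rm E}}{\sigma_\epsilon}\sqrt{\pi n}\,\sqrt{\ln(\theta_{\rm out}/\theta_{\rm in})}$$ 
$$\quad = 8.4 \left(\frac{n}{30\,{\rm arcmin}^{-2}}\right)^{1/2} \left(\frac{\sigma_\epsilon}{0.3}\right)^{-1} \left(\frac{\sigma_v}{600\,{\rm km\,s^{-1}}}\right)^{2} \left(\frac{\ln(\theta_{\rm out}/\theta_{\rm in})}{\ln 10}\right)^{1/2} \left\langle\frac{D_{\rm ds}}{D_{\rm s}}\right\rangle .$$ (4.24) 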

From this consideration we conclude that clusters of galaxies with $\sigma_v \gtrsim 600$ km/s can 
be detected with sufficiently large S/N by weak lensing, but individual galaxies ($\sigma_v \sim 
200$ km/s) are too weak as lenses to be detected individually. 
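
The scaling of the detection significance with the lens velocity dispersion can be evaluated directly. The sketch below is illustrative only: it assumes the optimally weighted result $S/N = (\theta_{\rm E}/\sigma_\epsilon)\sqrt{\pi n \ln(\theta_{\rm out}/\theta_{\rm in})}$ with the SIS Einstein radius $\theta_{\rm E} = 4\pi(\sigma_v/c)^2 D_{\rm ds}/D_{\rm s}$; the function names and default values are hypothetical choices matching the fiducial numbers used in the text.

```python
import math

C_KM_S = 299792.458                         # speed of light in km/s
ARCMIN_PER_RAD = 180.0 * 60.0 / math.pi

def einstein_radius_sis_arcmin(sigma_v_kms, dds_over_ds=1.0):
    """Einstein radius of a singular isothermal sphere, in arcminutes."""
    return 4.0 * math.pi * (sigma_v_kms / C_KM_S) ** 2 * dds_over_ds * ARCMIN_PER_RAD

def weak_lensing_snr(sigma_v_kms, n_per_arcmin2=30.0, sigma_eps=0.3,
                     theta_out_over_in=10.0, dds_over_ds=1.0):
    """Signal-to-noise ratio of the optimally weighted tangential alignment."""
    theta_e = einstein_radius_sis_arcmin(sigma_v_kms, dds_over_ds)
    return (theta_e / sigma_eps) * math.sqrt(
        math.pi * n_per_arcmin2 * math.log(theta_out_over_in))

snr_cluster = weak_lensing_snr(600.0)       # cluster-scale lens
snr_galaxy = weak_lensing_snr(200.0)        # galaxy-scale lens
```

For the fiducial values and $D_{\rm ds}/D_{\rm s} = 1$ the cluster case gives a value between 8 and 9, in line with the prefactor quoted in (4.24), whereas the galaxy case is lower by a factor of nine because $S/N \propto \sigma_v^2$. Realistic distance ratios $D_{\rm ds}/D_{\rm s} < 1$ reduce both numbers further.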

4.6. Galaxy-galaxy lensing 

Whereas galaxies are not massive enough to show a weak lensing signal individually, the 
signal of many galaxies can be (statistically) superposed. Consider sets of foreground 
(lens) and background galaxies; on average, in a foreground-background galaxy pair, 
the ellipticity of the background galaxy will be oriented preferentially in the direction 
tangent to the connecting line. In other words, if $\varphi$ is the angle between the major axis 
of the background galaxy and the connecting line between foreground and background 
galaxy, values $\pi/4 \le \varphi \le \pi/2$ should be slightly more frequent than $0 \le \varphi \le \pi/4$. The 
mean tangential ellipticity $\langle\epsilon_{\rm t}(\theta)\rangle$ of background galaxies relative to the direction towards 
foreground galaxies measures the mean tangential shear at this separation. 
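
The statistical superposition can be sketched as a stack over foreground-background pairs. The example below is illustrative only and is not the procedure of any particular survey: the function names are hypothetical, the mock catalogue contains a single SIS lens with $\gamma_{\rm t} = \theta_{\rm E}/(2\theta)$, the intrinsic ellipticities are drawn from a Gaussian, and the weak lensing approximation $\epsilon = \epsilon^{(s)} + \gamma$ is assumed.

```python
import cmath
import math
import random

def mean_tangential_ellipticity(lenses, sources, theta_min, theta_max):
    """Average tangential ellipticity over all lens-source pairs whose
    separation lies in [theta_min, theta_max). Returns (mean, number of pairs)."""
    total, n_pairs = 0.0, 0
    for lx, ly in lenses:
        for sx, sy, eps in sources:
            dx, dy = sx - lx, sy - ly
            sep = math.hypot(dx, dy)
            if theta_min <= sep < theta_max:
                phi = math.atan2(dy, dx)
                total += -(eps * cmath.exp(-2j * phi)).real
                n_pairs += 1
    if n_pairs == 0:
        return float("nan"), 0
    return total / n_pairs, n_pairs

def mock_sources(n, half_width, theta_e, sigma_component, rng):
    """Background galaxies around an SIS lens at the origin (mock data)."""
    sources = []
    for _ in range(n):
        x = rng.uniform(-half_width, half_width)
        y = rng.uniform(-half_width, half_width)
        r = max(math.hypot(x, y), 1e-6)          # avoid division by zero
        phi = math.atan2(y, x)
        shear = -(theta_e / (2.0 * r)) * cmath.exp(2j * phi)
        intrinsic = complex(rng.gauss(0.0, sigma_component),
                            rng.gauss(0.0, sigma_component))
        sources.append((x, y, intrinsic + shear))
    return sources

rng = random.Random(1)
catalogue = mock_sources(20000, 2.0, 0.1, 0.2, rng)
e_t, n_pairs = mean_tangential_ellipticity([(0.0, 0.0)], catalogue, 1.0, 2.0)
```

In this mock the mean tangential shear in the annulus is about 0.03, far below the scatter of an individual galaxy, and it only emerges once thousands of pairs are averaged. In real data the same sum runs over many foreground galaxies, which is what makes the signal of individually undetectable lenses measurable.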

The strength of this mean tangential shear measures mass properties of the galaxy 
population selected as potential lenses. In order to properly interpret the lensing signal, 
one needs to know the redshift distribution of the foreground and background galaxies. 
Furthermore, one needs to assume a relation between the lens galaxies' luminosity and 
mass properties (such as a Faber-Jackson type of relation), unless the sample is so large 
that one can finely bin the galaxies with respect to their luminosities; this requires of 
course redshift information. In this way, the velocity dispersion $\sigma_*$ of an $L_*$-galaxy can 
be determined from galaxy-galaxy lensing. Furthermore, galaxy-galaxy lensing provides 
a highly valuable tool to study the mass distribution of galaxy halos at distances from 
their centers which are much larger than the extent of luminous tracers, such as stars 
and gas - or ask the question of where the galaxy halos 'end'. 

Whereas the first detection of galaxy-galaxy lensing (Brainerd et al. 1996) was based 
on a single field with ~ 9' sidelength, much larger surveys have now become available, 
most notably the SDSS (Fischer et al. 2000; McKay et al. 2001). These large data sets 
have allowed the splitting of the lens galaxies into subsamples and to investigate their 
properties separately. From this it was verified that early-type galaxies have a larger 
mass than spiral galaxies with the same luminosity, and that this behaviour extends to 
large radii. Furthermore, the lensing signal for early-type galaxies can be detected out to 
much larger scales than for late-types. The interpretation of this result is not unique: it 
either can mean that ellipticals have a more extended halo than spirals, or that the lens 
signal from ellipticals, which tend to be preferentially located inside groups and clusters, 
arises in fact from the host halo in which they reside (see also Guzik & Seljak 2002). 
Indeed, when the lens galaxy sample is divided into those living in high- and low-density 
environments, the former ones have a significantly more extended lensing signal. 

What the galaxy-galaxy signal really measures is the relation between light (galaxies) 
and mass. In its simplest terms, this relation can be expressed by a bias factor b and 
the correlation coefficient r. Schneider (1998) and van Waerbeke (1998) pointed out 
that lensing can be used to study the bias factor as a function of scale and redshift, by 
correlating the lensing signal with the number density of galaxies. In fact, as shown in 
Hoekstra et al. (2002b), both b and r can be expressed in terms of the galaxy-galaxy 
lensing signal, the angular correlation function of the (lens) galaxies and the cosmic shear 
signal (see Sect. 6 below). Applying this method to the combination of the Red-Sequence 
Cluster Survey and the VIRMOS-DESCART survey, they derived the scale dependence 
of b and r; on large (linear) scales, their results are compatible with constant values, 
whereas on smaller scales it appears that both of these functions vary. Future surveys will 
allow much more detailed studies on the relation between mass and light, and therefore 
determine the biasing properties of galaxies from observations directly. This is of course 
of great interest, since the unknown behavior of the biasing yields the uncertainty in the 
transformation of the power spectrum of the galaxy distribution, as determined from 
extensive galaxy redshift surveys, to that of the underlying mass distribution. Hence, 
weak lensing is able to provide this crucial calibration. 

5. Lensing by clusters of galaxies 

5.1. Introduction 

Clusters are the most massive bound structures in the Universe; this, together with 
the (related) fact that their dynamical time scale (e.g., the crossing time) is not much 
smaller than the Hubble time, so that they retain a 'memory' of their formation, renders 
them of particular interest for cosmologists. The evolution of their abundance, i.e., their 
comoving number density as a function of mass and redshift, is an important probe for 
cosmological models. Furthermore, they form signposts of the dark matter distribution 
in the Universe. Clusters act as laboratories for studying the evolution of galaxies and 
baryons in the Universe. In fact, clusters were (arguably) the first objects for which the 
presence of dark matter has been concluded (by Zwicky in 1933). 

5.2. The mass of galaxy clusters 

Cosmologists can predict the abundance of clusters as a function of their mass (e.g., 
using numerical simulations); however, the mass of a cluster is not directly observable, 
but only its luminosity, or the temperature of the X-ray emitting intra-cluster medium. 
Therefore, in order to compare observed clusters with the cosmological predictions, one 
needs a way to determine their masses. Three principal methods for determining the 
mass of galaxy clusters are in use: 

• Assuming virial equilibrium, the observed velocity distribution of galaxies in clusters 
can be converted into a mass estimate; this method typically requires assumptions about 
the statistical distribution of the anisotropy of the galaxy orbits. 

• The hot intra-cluster gas, as visible through its Bremsstrahlung in X-rays, traces the 
gravitational potential of the cluster. Under certain assumptions (see below), the mass 
profile can be constructed from the X-ray emission. 

• Weak and strong gravitational lensing probe the projected mass profile of clusters; 
this will be described further below. 

All three methods are complementary; lensing yields the line-of-sight projected density 
of clusters, in contrast to the other two methods which probe the mass inside spheres. 
On the other hand, those rely on equilibrium (and symmetry) conditions. 



5.2.1. X-ray mass determination of clusters 

The intracluster gas emits via Bremsstrahlung; the emissivity depends on the gas 
density and temperature, and, at lower T, on its chemical composition. Assuming that 
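the gas is in hydrostatic equilibrium in the potential well of the cluster, the gas pressure $P$ 
must balance gravity, or 

$$\nabla P = -\rho_{\rm g}\,\nabla\Phi ,$$ 

where $\Phi$ is the gravitational potential and $\rho_{\rm g}$ is the gas density. In the case of spherical 
symmetry, this becomes 

$$\frac{1}{\rho_{\rm g}}\frac{{\rm d}P}{{\rm d}r} = -\frac{{\rm d}\Phi}{{\rm d}r} = -\frac{G M(r)}{r^2} .$$ 

From the X-ray brightness profile and temperature measurement, $M(r)$, the total mass 
inside $r$ (dark plus luminous) can then be determined, 

$$M(r) = -\frac{k_{\rm B} T\,r^2}{G \mu m_{\rm p}} \left( \frac{{\rm d}\ln\rho_{\rm g}}{{\rm d}r} + \frac{{\rm d}\ln T}{{\rm d}r} \right) .$$ (5.1) 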



However, the two major X-ray satellites currently operating, Chandra & XMM-Newton, 
have revealed that at least the inner regions of clusters show a considerably more compli- 
cated structure than implied by hydrostatic equilibrium. In some cases, the intracluster 
medium is obviously affected by a central AGN, which produces additional energy and 
entropy input. Cold fronts, with very sharp edges, and shocks have been discovered, 
most likely showing ongoing merger events. The temperature and metallicity appear to 
be strongly varying functions of position. Therefore, mass estimates of central parts of 
clusters from X-ray observations require special care. 
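
Where hydrostatic equilibrium does hold, the mass estimate $M(r) = -(k_{\rm B}T r/G\mu m_{\rm p})\,({\rm d}\ln\rho_{\rm g}/{\rm d}\ln r + {\rm d}\ln T/{\rm d}\ln r)$ is straightforward to evaluate. The sketch below is illustrative only: the function names are hypothetical, the gas is taken to be isothermal with a density profile $\rho_{\rm g} \propto (1 + r^2/r_{\rm c}^2)^{-3\beta/2}$, a mean molecular weight $\mu = 0.6$ is assumed, and the physical constants are rounded.

```python
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
M_P = 1.6726e-27       # proton mass, kg
KEV = 1.602177e-16     # 1 keV in J
MPC = 3.0857e22        # 1 Mpc in m
M_SUN = 1.989e30       # solar mass, kg

def hydrostatic_mass_msun(r_mpc, kt_kev, dlnrho_dlnr, dlnt_dlnr=0.0, mu=0.6):
    """Total mass inside radius r from the logarithmic slopes of the gas
    density and temperature profiles, in solar masses."""
    prefactor = kt_kev * KEV * r_mpc * MPC / (G * mu * M_P)   # in kg
    return -prefactor * (dlnrho_dlnr + dlnt_dlnr) / M_SUN

def beta_model_slope(r, r_core, beta):
    """Logarithmic density slope d ln(rho_g) / d ln(r) of the assumed profile."""
    x2 = (r / r_core) ** 2
    return -3.0 * beta * x2 / (1.0 + x2)

# Hypothetical cluster: kT = 7 keV, core radius 0.2 Mpc, beta = 2/3.
slope = beta_model_slope(1.0, 0.2, 2.0 / 3.0)
mass_1mpc = hydrostatic_mass_msun(1.0, 7.0, slope)
```

For these example parameters the mass inside 1 Mpc comes out at several $10^{14} M_\odot$. The estimate inherits every departure from equilibrium and spherical symmetry, which is why it needs special care in the disturbed central regions described above.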



5.3. Luminous arcs & multiple images 

Strong lensing effects in clusters show up in the form of giant luminous arcs, strongly 
distorted arclets, and multiple images of background galaxies. Since strong lensing occurs 
only in the central part of clusters (typically corresponding to $\sim 50\,h^{-1}$ kpc), it can be 
used to probe only their inner mass structure. However, strong lensing yields by far the 
most accurate central mass determinations; in some favourable cases with many strong 
lensing features (such as for Abell 2218; see Fig. 2), accuracies better than ~ 10% can 
be achieved. 

5.3.1. First go: $M(\le \theta_{\rm E})$ 

Giant arcs occur where the distortion (and magnification) is very large, that is near 
critical curves. To a first approximation, assuming a spherical mass distribution, the 
location of the arc relative to the cluster center (which usually is assumed to coincide 
with the brightest cluster galaxy) yields the Einstein radius of the cluster, so that the 
mass estimate (3.1) can be applied. Therefore, this simple estimate yields the mass 
inside the arc radius. However, this estimate is not very accurate, perhaps good to 
within $\sim 50\%$. Its reliability depends on the level of asymmetry and substructure in the 
cluster mass distribution (Bartelmann 1995). Furthermore, it is likely to overestimate 
the mass in the mean, since arcs preferentially occur along the major axis of clusters. 
Of course, the method is very difficult to apply if the center of the cluster is not readily 
identified or if it is obviously bimodal. For these reasons, this simple method for mass 
estimates is not regarded as particularly accurate. 
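
For orientation, the simple estimate can be written out numerically. The sketch below assumes that the mass estimate referred to as (3.1) is the standard expression for the mass inside the Einstein radius, $M(\le\theta_{\rm E}) = \pi (D_{\rm d}\theta_{\rm E})^2 \Sigma_{\rm cr}$ with $\Sigma_{\rm cr} = c^2 D_{\rm s}/(4\pi G D_{\rm d} D_{\rm ds})$; the function name and the example distances are hypothetical.

```python
import math

G = 6.674e-11                  # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8               # speed of light, m/s
MPC = 3.0857e22                # 1 Mpc in m
M_SUN = 1.989e30               # solar mass, kg
ARCSEC = math.pi / (180.0 * 3600.0)

def einstein_mass_msun(theta_e_arcsec, d_d_mpc, d_s_mpc, d_ds_mpc):
    """Mass inside the Einstein radius, M = c^2 theta_E^2 D_d D_s / (4 G D_ds),
    for angular-diameter distances in Mpc; result in solar masses."""
    theta_e = theta_e_arcsec * ARCSEC
    distance = d_d_mpc * d_s_mpc / d_ds_mpc * MPC
    return C ** 2 * theta_e ** 2 * distance / (4.0 * G) / M_SUN

# Hypothetical arc at 30 arcsec; note that angular-diameter distances
# do not add, so D_ds is not simply D_s - D_d.
mass_arc = einstein_mass_msun(30.0, 1000.0, 2000.0, 1200.0)
```

For these example values the enclosed mass is of order $10^{14} M_\odot$. Since the estimate scales as $\theta_{\rm E}^2$, a misidentified cluster center or an arc that does not trace the critical curve translates directly into the uncertainty discussed above.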

5.3.2. Detailed modelling 

The mass determination in cluster centers becomes much more accurate if several 
arcs and/or multiple images are present, since in this case, detailed modelling can be 
done. This typically proceeds in an interactive way: First, multiple images have to 
be identified (based on their colors and/or detailed morphology, as available with HST 
imaging). Simple (plausible) mass models are then assumed, with parameters fixed by 
matching the multiple images, and requiring the distortion at the arc location(s) to be 
strong and have the correct orientation. This model then predicts the presence of further 
multiple images; they can be checked for through morphology and color. If confirmed, 
a new, refined model is constructed, which yields further strong lensing predictions etc. 
Such models have predictive power and can be trusted in quite some detail; the accuracy 
of mass estimates in some favourable cases can be as high as a few percent. 

In fact, these models can be used to predict the redshift of arcs and arclets (Kneib et 
al. 1994): since the distortion of a lens depends on the source redshift, once a detailed 
mass model is available, one can estimate the value of the lens strength $\propto D_{\rm ds}/D_{\rm s}$ and 
thus infer the redshift. This method has been successfully applied to HST observations 
of clusters (Ebbels et al. 1998). Of course, having spectroscopic redshifts of the arcs 
available increases the accuracy of the calibration of the mass models; they are therefore 
very useful. 

5.3.3. Results 

The main results of the strong lensing investigations of clusters can be summarized as 
follows: 

• The mass in cluster centers is much more concentrated than predicted by (simple) 
models based on X-ray observations. The latter usually predict a relatively large core 
of the mass distribution. These large cores would render clusters sub-critical to lensing, 
i.e., they would be unable to produce giant arcs or multiple images. In fact, when arcs 
were first discovered they came as a big surprise because of these expectations. By now 
we know that the intracluster medium is much more complicated than assumed in these 
'$\beta$-model' fits for the X-ray emission. 

• The mass distribution in the inner part of clusters often shows strong substructure, 
or multiple mass peaks. These are also seen in the galaxy distribution of clusters, but 
with the arcs they can be verified to also correspond to mass peaks. They are easily 
understood in the frame of hierarchical mergers in a CDM model; the merged clusters retain 
their multiple peaks for a dynamical time or even longer, and are therefore not in virial 
equilibrium. 

• The orientation of the (dark) matter appears to be fairly strongly correlated with 
the orientation of the light in the cD galaxy; this supports the idea that the growth of the 
cD galaxy is related to the cluster as a whole, through repeated accretion of lower-mass 
member galaxies. In that case, the cD galaxy 'knows' the orientation of the cluster. 

• There is in general good agreement between lensing and X-ray mass estimates for 
those clusters where a 'cooling flow' indicates that they are in dynamical equilibrium, 
provided the X-ray analysis takes the presence of the cooling flow into account. 

5.4. Mass reconstructions from weak lensing 

Whereas strong lensing probes the mass distribution in the inner part of clusters, weak 
lensing can be used to study the mass distribution at much larger angular separations 
from the cluster center. In fact, as we shall see, weak lensing can provide a parameter- 
free reconstruction of the projected two-dimensional mass distribution in clusters. This 
discovery (Kaiser & Squires 1993) actually marked the beginning of quantitative weak 
lensing research. 

5.4.1. The Kaiser-Squires inversion 

Weak lensing yields an estimate of the local (reduced) shear, as discussed in Sect. 4.2. 
Here we shall discuss how to derive the surface mass density from a measurement of the 
(reduced) shear. Starting from (2.9) and the definition (2.15) of the shear, one finds that 
the latter can be written in the form 

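$$\gamma(\boldsymbol{\theta}) = \frac{1}{\pi} \int_{\mathbb{R}^2} d^2\theta'\; \mathcal{D}(\boldsymbol{\theta} - \boldsymbol{\theta}')\, \kappa(\boldsymbol{\theta}') , \quad \text{with} \quad \mathcal{D}(\boldsymbol{\theta}) \equiv \frac{\theta_2^2 - \theta_1^2 - 2 i \theta_1 \theta_2}{|\boldsymbol{\theta}|^4} = \frac{-1}{(\theta_1 - i \theta_2)^2} .$$

Hence, the complex shear $\gamma$ is a convolution of $\kappa$ with the kernel $\mathcal{D}$, or, in other words, $\mathcal{D}$ 
describes the shear generated by a point mass. In Fourier space this convolution becomes 
a multiplication, 

$$\hat\gamma(\boldsymbol{\ell}) = \pi^{-1}\, \hat{\mathcal{D}}(\boldsymbol{\ell})\, \hat\kappa(\boldsymbol{\ell}) \quad \text{for } \boldsymbol{\ell} \ne 0 .$$

This relation can be inverted to yield 

$$\hat\kappa(\boldsymbol{\ell}) = \pi^{-1}\, \hat\gamma(\boldsymbol{\ell})\, \hat{\mathcal{D}}^*(\boldsymbol{\ell}) \quad \text{for } \boldsymbol{\ell} \ne 0 , \tag{5.3}$$

where 

$$\hat{\mathcal{D}}(\boldsymbol{\ell})\, \hat{\mathcal{D}}^*(\boldsymbol{\ell}) = \pi^2$$

was used. Fourier back-transformation of (5.3) then yields 

$$\kappa(\boldsymbol{\theta}) - \kappa_0 = \frac{1}{\pi} \int_{\mathbb{R}^2} d^2\theta'\; \mathcal{D}^*(\boldsymbol{\theta} - \boldsymbol{\theta}')\, \gamma(\boldsymbol{\theta}') = \frac{1}{\pi} \int_{\mathbb{R}^2} d^2\theta'\; {\rm Re}\left[ \mathcal{D}^*(\boldsymbol{\theta} - \boldsymbol{\theta}')\, \gamma(\boldsymbol{\theta}') \right] . \tag{5.4}$$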



Note that the constant $\kappa_0$ occurs since the $\boldsymbol{\ell} = 0$ mode is undetermined. Physically, this 
is related to the fact that a uniform surface mass density yields no shear. Furthermore, 
it is obvious (physically, though not so easily seen mathematically) that $\kappa$ must be real; 
for this reason, the imaginary part of the integral should be zero, and taking the real 
part only makes no difference. In practice, however, noisy data inserted into the 
inversion formula will produce a non-zero imaginary part. What (5.4) 
shows is that if $\gamma$ can be measured, $\kappa$ can be determined. 
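
The Fourier-space form of the inversion lends itself to a short numerical check. The following is a minimal sketch, not the procedure of any particular paper: it assumes a periodic pixel grid (so that FFTs can stand in for the integrals over $\mathbb{R}^2$), noise-free shear, and the hypothetical function names `kernel_hat`, `shear_from_kappa` and `kappa_from_shear`. It uses $\hat{\mathcal{D}}(\boldsymbol{\ell})/\pi = (\ell_1^2 - \ell_2^2 + 2i\ell_1\ell_2)/|\boldsymbol{\ell}|^2$ and sets the undetermined $\boldsymbol{\ell} = 0$ mode to zero.

```python
import numpy as np


def kernel_hat(shape):
    """Return D_hat(l)/pi = (l1^2 - l2^2 + 2i l1 l2)/|l|^2 on an FFT grid.

    The l = 0 entry is set to zero, since that mode is undetermined.
    """
    l1 = 2.0 * np.pi * np.fft.fftfreq(shape[0])[:, None]
    l2 = 2.0 * np.pi * np.fft.fftfreq(shape[1])[None, :]
    l_sq = l1**2 + l2**2
    l_sq[0, 0] = 1.0  # placeholder to avoid 0/0; overwritten below
    d_hat = (l1**2 - l2**2 + 2j * l1 * l2) / l_sq
    d_hat[0, 0] = 0.0
    return d_hat


def shear_from_kappa(kappa):
    """Forward relation: gamma_hat = (D_hat/pi) * kappa_hat."""
    return np.fft.ifft2(kernel_hat(kappa.shape) * np.fft.fft2(kappa))


def kappa_from_shear(gamma):
    """Inversion (5.3): kappa_hat = gamma_hat * conj(D_hat)/pi.

    Returns a complex map; for noise-free shear the imaginary part vanishes,
    and the real part equals kappa up to an additive constant.
    """
    return np.fft.ifft2(np.conj(kernel_hat(gamma.shape)) * np.fft.fft2(gamma))
```

For a toy Gaussian mass clump, `kappa_from_shear(shear_from_kappa(kappa)).real` reproduces `kappa - kappa.mean()`, which illustrates both the inversion and the loss of the constant $\kappa_0$. With real data the periodic-grid assumption is not satisfied, which is one motivation for the finite-field methods discussed further on.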

Before looking at this in more detail, we briefly mention some difficulties with the 
inversion formula as given above: 

• Since $\gamma$ can at best be estimated at discrete points (galaxy images), smoothing is 
required. One might be tempted to replace the integral in (5.4) by a discrete sum over 
galaxy positions, but as shown by Kaiser & Squires (1993), the resulting mass density 
estimator has infinite noise (due to the $\theta^{-2}$-behavior of the kernel $\mathcal{D}$). 

• It is not the shear $\gamma$, but the reduced shear $g$ that can be determined from the galaxy 
ellipticities; hence, one needs to obtain a mass density estimator in terms of $g$. 

• The integral in (5.4) extends over $\mathbb{R}^2$, whereas data are available only on a finite 
field; therefore, it needs to be seen whether modifications allow the construction of an 
estimator for the surface mass density from finite-field shear data. 

• To get absolute values for the surface mass density, the additive constant $\kappa_0$ is of 
course a nuisance. As will be explained soon, this indeed is the largest problem in mass 
reconstructions, and carries the name mass-sheet degeneracy (note that we mentioned 
this effect before, in the context of determining the Hubble constant from time-delays in 
lens systems). 

5.4.2. Improvements and generalizations 

Smoothing of data is needed to get a shear field from discrete data points. When 
smoothed with a Gaussian kernel of angular scale $\theta_{\rm s}$, the covariance of the resulting mass 
map is finite, and given by (Lombardi & Bertin 1998; van Waerbeke 2000) 

Thus, the larger the smoothing scale, the less noise the corresponding mass map 
has. Note that, since (i) smoothing can be represented by a convolution, (ii) the relation 
between $\kappa$ and $\gamma$ is a convolution, and (iii) convolution operations commute, it does 
not matter whether the shear field is smoothed first and inserted into (5.4), or the noisy 
inversion obtained by transforming the integral into a sum over galaxy image positions 
is smoothed afterwards with the same smoothing kernel. 

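Noting that it is the reduced shear $g = \gamma/(1 - \kappa)$ that can be estimated from the 
ellipticity of images, one can write: 

$$\kappa(\boldsymbol{\theta}) - \kappa_0 = \frac{1}{\pi} \int d^2\theta'\; \left[ 1 - \kappa(\boldsymbol{\theta}') \right]\, {\rm Re}\left[ \mathcal{D}^*(\boldsymbol{\theta} - \boldsymbol{\theta}')\, g(\boldsymbol{\theta}') \right] ; \tag{5.5}$$

this integral equation for $\kappa$ can be solved by iteration, and it converges quickly. Note that 
in this case, the undetermined constant $\kappa_0$ no longer corresponds to adding a uniform 
mass sheet. What the arbitrary value of $\kappa_0$ corresponds to can be seen as follows: The 
transformation 

$$\kappa(\boldsymbol{\theta}) \to \kappa'(\boldsymbol{\theta}) = \lambda \kappa(\boldsymbol{\theta}) + (1 - \lambda) \quad \text{or} \quad \left[ 1 - \kappa'(\boldsymbol{\theta}) \right] = \lambda \left[ 1 - \kappa(\boldsymbol{\theta}) \right] \tag{5.6}$$

changes the shear $\gamma \to \gamma' = \lambda \gamma$, and thus leaves $g$ invariant; this is the mass-sheet 
degeneracy! It can be broken if magnification information can be obtained, since 

$$\mu \to \lambda^{-2} \mu .$$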

Magnification can in principle be obtained from the number counts of images (Broadhurst 
et al. 1995), owing to magnification bias, provided the unlensed number density is 
sufficiently well known. Indeed, magnification effects have been detected in a few clusters 
as a number depletion of faint galaxy images towards the center of the clusters (e.g., Fort 
et al. 1997; Taylor et al. 1998; Dye et al. 2002). In principle, the mass sheet degeneracy 
can also be broken if redshift information of the source galaxies is available and if the 
sources are widely distributed in redshift; however, even in this case it is only mildly 
broken, in the sense that one needs a fairly high number density of background galaxies 
in order to fix the parameter $\lambda$ to within $\sim 10\%$. 
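
The invariance of $g$ under the transformation (5.6), and the scaling of the magnification, can be verified at a single point with a few lines of arithmetic. This is only an illustrative sketch: the function names are hypothetical, and the magnification is assumed to take the standard lens-theory form $\mu = [(1 - \kappa)^2 - |\gamma|^2]^{-1}$.

```python
def reduced_shear(kappa, gamma):
    """Reduced shear g = gamma / (1 - kappa); gamma may be complex."""
    return gamma / (1.0 - kappa)


def magnification(kappa, gamma):
    """Magnification, assuming mu = 1 / [(1 - kappa)^2 - |gamma|^2]."""
    return 1.0 / ((1.0 - kappa) ** 2 - abs(gamma) ** 2)


def mass_sheet_transform(kappa, gamma, lam):
    """Apply (5.6): kappa -> lam*kappa + (1 - lam), gamma -> lam*gamma."""
    return lam * kappa + (1.0 - lam), lam * gamma
```

For example, with `kappa = 0.2`, `gamma = 0.05 + 0.03j` and `lam = 0.8`, the transformed pair gives the same `reduced_shear` while `magnification` is multiplied by `lam**-2`, which is why magnification information can break the degeneracy.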

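Finite-field inversions start from the relation (Kaiser 1995) 

$$\nabla \kappa = \begin{pmatrix} \gamma_{1,1} + \gamma_{2,2} \\ \gamma_{2,1} - \gamma_{1,2} \end{pmatrix} \equiv \boldsymbol{u}_\gamma(\boldsymbol{\theta}) , \tag{5.7}$$

which is a local relation between shear and surface mass density; it can easily be derived 
from the definitions of $\kappa$ (2.10) and $\gamma$ (2.15) in terms of $\psi_{,ij}$. A similar relation can be 
derived in terms of reduced shear, 

$$\nabla K(\boldsymbol{\theta}) = \frac{-1}{1 - g_1^2 - g_2^2} \begin{pmatrix} 1 - g_1 & -g_2 \\ -g_2 & 1 + g_1 \end{pmatrix} \begin{pmatrix} g_{1,1} + g_{2,2} \\ g_{2,1} - g_{1,2} \end{pmatrix} \equiv \boldsymbol{u}_g(\boldsymbol{\theta}) , \tag{5.8}$$

where 

$$K(\boldsymbol{\theta}) \equiv \ln\left[ 1 - \kappa(\boldsymbol{\theta}) \right] \tag{5.9}$$

is a non-linear function of $\kappa$. These equations can be integrated, by formulating them as 
a Neumann boundary-value problem on the data field $\mathcal{U}$ (Seitz & Schneider 2001), 

$$\nabla^2 \kappa = \nabla \cdot \boldsymbol{u}_\gamma \quad \text{with} \quad \boldsymbol{n} \cdot \nabla \kappa = \boldsymbol{n} \cdot \boldsymbol{u}_\gamma \ \text{ on } \partial\mathcal{U} , \tag{5.10}$$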



where $\boldsymbol{n}$ is the outward-directed normal on the boundary $\partial\mathcal{U}$ of $\mathcal{U}$. The analogous 
equation holds for $K$ in terms of $g$ and $\boldsymbol{u}_g$. The numerical solution of these equations 
is fast, using overrelaxation (see Press et al. 1992). In fact, the foregoing formulation of 
the problem is equivalent (Lombardi & Bertin 1998) to the minimization of the action 

$$A = \int_{\mathcal{U}} d^2\theta\; \left| \nabla \kappa(\boldsymbol{\theta}) - \boldsymbol{u}_\gamma(\boldsymbol{\theta}) \right|^2 , \tag{5.11}$$

from which the Neumann problem can be derived as the Euler equation of the 
variational principle $\delta A = 0$. These parameter-free mass reconstructions have been 
applied to quite a number of clusters; they provide a tool to make their dark matter 
distribution 'visible'. 
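
The local relation between the gradient of the surface mass density and the first derivatives of the shear, on which the finite-field inversion rests, can be checked numerically for a toy lens. The sketch below is an illustration under stated assumptions rather than a reconstruction code: it assumes a hypothetical Gaussian deflection potential $\psi = \exp[-(A x^2 + B y^2)/2]$, uses $\kappa = (\psi_{,11} + \psi_{,22})/2$, $\gamma_1 = (\psi_{,11} - \psi_{,22})/2$, $\gamma_2 = \psi_{,12}$, and evaluates the remaining derivatives by central differences.

```python
import math

A, B = 1.0, 2.0  # hypothetical shape parameters of the toy potential


def psi_second_derivatives(x, y):
    """Analytic psi_11, psi_22, psi_12 for psi = exp(-(A x^2 + B y^2)/2)."""
    psi = math.exp(-0.5 * (A * x * x + B * y * y))
    return ((A * A * x * x - A) * psi,
            (B * B * y * y - B) * psi,
            A * B * x * y * psi)


def kappa(x, y):
    p11, p22, _ = psi_second_derivatives(x, y)
    return 0.5 * (p11 + p22)


def gamma1(x, y):
    p11, p22, _ = psi_second_derivatives(x, y)
    return 0.5 * (p11 - p22)


def gamma2(x, y):
    return psi_second_derivatives(x, y)[2]


def d_dx(f, x, y, h=1e-4):
    """Central difference along the first coordinate."""
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)


def d_dy(f, x, y, h=1e-4):
    """Central difference along the second coordinate."""
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)


def grad_kappa(x, y):
    return d_dx(kappa, x, y), d_dy(kappa, x, y)


def u_gamma(x, y):
    """Vector (gamma_{1,1} + gamma_{2,2}, gamma_{2,1} - gamma_{1,2})."""
    return (d_dx(gamma1, x, y) + d_dy(gamma2, x, y),
            d_dx(gamma2, x, y) - d_dy(gamma1, x, y))
```

At any test point, `grad_kappa(x, y)` and `u_gamma(x, y)` agree to within the finite-difference error. A full finite-field reconstruction would go on to solve the boundary-value problem on the data field, which is not attempted here.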



5.4.3. Results 

The mass reconstruction techniques discussed above have been applied to quite a 
number of clusters up to now, yielding parameter-free mass maps. It is obvious that the 
quality of a mass map depends on the number density of galaxies that can be used for a 
shear estimate, which in turn depends on the depth and the seeing of the observational 
data. Furthermore, the mass profiles of clusters are much more reliably determined if 
the data field covers a large region, as boundary effects get minimized. 

The first application of the Kaiser & Squires reconstruction technique was done to the 
X-ray detected cluster MS1224+20 (Fahlman et al. 1994); it resulted in an estimate of the 
mass-to-light ratio in this $z = 0.33$ cluster of $\sim 800\,h$, considerably larger than 'normal' 
values of $M/L \sim 250\,h$. This conclusion was later reinforced by a fully independent 
weak-lensing analysis of this cluster by Fischer (1999). This mass estimate is in fact 
in conflict with the measured velocity dispersion of the cluster galaxies, which is much 
smaller than obtained by an SIS fit to the shear data. The line-of-sight to this cluster 
is fairly complicated, with additional peaks in the redshift distribution of galaxies in the 
field (Carlberg et al. 1994), all of which are included in the weak lensing measurement. 
Furthermore, this cluster may not be in a relaxed state, which probably renders the 
X-ray mass analysis inaccurate. 

Figure 6. Mass reconstruction (contours) of the inner part of the high-redshift ($z_{\rm d} = 0.83$) 
cluster MS 1054-03, based on a mosaic of six pointings obtained with the WFPC2@HST (from 
Hoekstra et al. 2000). The splitting of the cluster core into three subcomponents, previously 
seen from ground-based images by Clowe et al. (2000), shows that this cluster is not yet relaxed. 

In fact, non-relaxed clusters are probably more common than naively expected. One 
example is shown in Fig. 6, a mass reconstruction of a high-redshift cluster based on 
HST data. The presence of three mass clumps, which coincide with three concentrations 
of cluster galaxies, indicates that this cluster is still in the process of merging. When 
the merging occurs along the line-of-sight, then it is less obvious in the mass maps. One 
example seems to be the cluster Cl0024+16, for which one obtains a large mass from 
the distance of its arcs from the cluster center (Colley et al. 1996), but which is fairly 
underluminous in X-rays for this mass. A detailed investigation of the structure of this 
cluster in radial velocity space by Czoske et al. (2002) has shown strong evidence for 
a collision of two clusters along the line-of-sight. Another example is provided by the 
cluster A1689, where the extended arc structures suggest an Einstein radius of about 
40" for this cluster, making this the strongest lensing cluster in the sky, but the weak 
lensing results do not support the enormous mass obtained from the arcs (see Clowe & 
Schneider 2001, King et al. 2002, and references therein). 

However, in many clusters the weak lensing mass estimates are in good agreement 
with those from dynamical estimates and X-ray determinations (e.g., Squires et al. 1996), 
provided the inner regions of the clusters are omitted - but for them, the weak lensing 
method does not have sufficient angular resolution anyway. For example, Hoekstra and 
collaborators have observed three X-ray selected clusters with HST mosaics, and their 
results, summarized in Hoekstra et al. (2002a), show that the SIS fit values for the 
velocity dispersion agree with those from spectroscopic investigations. 

The mass maps can also be used to study how well the cluster galaxy distribution traces 
the underlying dark matter. An HST-data-based mass reconstruction of Cl0939+47 (Seitz 
et al. 1996) shows detailed structure that is very well matched with the distribution of 
bright cluster galaxies. A more quantitative investigation was performed by Wilson et 
al. (2001), showing that early-type galaxies trace the dark matter distribution quite well. 

One of the predictions of CDM models for structure formation is that clusters of 
galaxies are located at the intersection points of filaments. In particular, this implies that 
a physical pair of clusters should be connected by a bridge or filament of (dark) matter, 
and weak lensing mass reconstructions can in principle be used to search for them. In 
the investigation of the z = 0.42 supercluster MS0302, Kaiser et al. (1998) found an 
indication of a possible filament connecting two of the three clusters, with the caveat (as 
pointed out by the authors) that the filament lies just along the boundary of two CCD 
chips. Gray et al. (2002) saw a filament connecting the two clusters A901A/901B in their 
mass reconstruction of the A901/902 supercluster field. One of the problems related to 
the unambiguous detection of filaments is the difficulty of defining what a 'filament' is, i.e. 
of devising a statistic to quantify the presence of a mass bridge. Because of that, it is 
difficult to distinguish between noise in the mass maps, the 'elliptical' extension of two 
clusters pointing towards each other, and a true filament. 

A perhaps surprising result is the difficulty of distinguishing the NFW mass 
profile from, say, power-law models such as the isothermal profile. In fact, even from 
weak lensing observations covering large fields, out to the virial radius of clusters (Clowe 
& Schneider 2001, 2002; King et al. 2002), the distinction between NFW and isothermal 
is present only at the $\lesssim 2\sigma$ level. The reason for this is the mass-sheet degeneracy. 
Within a family of models (such as the NFW), the model parameters can be determined 
with fairly high accuracy. One way to improve on the distinction between various mass 
profiles is to incorporate strong lensing constraints into the mass reconstruction (e.g., 
using inverse methods; see below). In particular, the multiple images seen in the cores of 
clusters can be used to determine the central mass profile (see Sand et al. 2002; Gavazzi et 
al. 2003). On the other hand, one can statistically superpose weak lensing measurements 
of clusters to obtain their average mass profile, as done by Dahle et al. (2003) for six 
clusters; even in that case, a (generalized) NFW profile is hardly distinguishable from an 
isothermal model. 

5.4.4. Inverse methods 

In addition to these 'direct' methods for determining $\kappa$, inverse methods have been 
developed, such as a maximum-likelihood fit (Bartelmann et al. 1996) to the data. In 
these techniques, one parameterizes the lens by the deflection potential $\psi$ on a grid and 
then maximizes 

(5.12) 

with respect to these gridded $\psi$-values. In order to avoid overfitting, one needs a 
regularization; entropy regularization (Seitz et al. 1998) seems best suited. It should be pointed 
out that the deflection potential $\psi$, and not the surface mass density $\kappa$, should be used as 
a variable, for two reasons: first, shear and $\kappa$ depend locally on $\psi$, and are thus readily 
calculated by finite differencing, whereas the relation between $\gamma$ and $\kappa$ is non-local and 
requires summation over all gridpoints. Second, and more importantly, the surface mass 
density on a finite field does not determine $\gamma$ on this field, since mass outside the field 
contributes to $\gamma$ as well. 

There are a number of reasons why inverse methods are in principle preferable to the 
direct methods discussed above. First, in the direct methods, the smoothing scale is set 
arbitrarily, and in general kept constant. It would be useful to have an objective way 
to choose this scale, and perhaps to let the smoothing scale be a function of position: e.g., 
in regions with larger number densities of sources, the smoothing scale could be reduced. 
Second, the direct methods do not allow additional input coming from observations; for 
example, if both shear and magnification information are available, the latter could not 
be incorporated into the mass reconstruction. The same is true for clusters where strong 
lensing constraints are known. 

5.5. Aperture mass 

In the weak lensing regime, $\kappa \ll 1$, the mass-sheet degeneracy corresponds to adding a 
uniform surface mass density $\kappa_0$. We shall now consider a quantity, linearly related to 
$\kappa$, that is unaffected by the mass-sheet degeneracy. Let $U(|\boldsymbol{\theta}|)$ be a compensated weight 
(or filter) function, with $\int d\theta\, \theta\, U(\theta) = 0$; then the aperture mass 

$$M_{\rm ap}(\boldsymbol{\theta}_0) = \int d^2\theta\; \kappa(\boldsymbol{\theta})\, U(|\boldsymbol{\theta} - \boldsymbol{\theta}_0|) \tag{5.13}$$

is independent of $\kappa_0$, as can be easily seen. The important point to notice is that $M_{\rm ap}$ 
can be written directly in terms of the shear (Schneider 1996) 

$$M_{\rm ap}(\boldsymbol{\theta}_0) = \int d^2\theta\; Q(|\boldsymbol{\theta}|)\, \gamma_{\rm t}(\boldsymbol{\theta}; \boldsymbol{\theta}_0) , \tag{5.14}$$

where we have defined the tangential component $\gamma_{\rm t}$ of the shear relative to the point $\boldsymbol{\theta}_0$, 
and 

$$Q(\theta) = \frac{2}{\theta^2} \int_0^\theta d\theta'\; \theta'\, U(\theta') - U(\theta) . \tag{5.15}$$

These relations can be derived from (5.7), by rewriting the partial derivatives in polar 
coordinates and subsequent integration by parts. 

We shall now consider a few properties of the aperture mass. 

• If U has finite support, then Q has finite support. This implies that the aperture 
mass can be calculated on a finite data field. 

• If U(9) = const, for < 9 < 9 m , then Q(9) = for the same interval. Therefore, the 
strong lensing regime (where the shear 7 deviates significantly from the reduced shear g) 
can be avoided by properly choosing U (and Q). 

. If U{9) = (7T91)- 1 for < 9 < 9 in , U(9) = -[tt^ - ^J]" 1 for 9 in < 9 < 9 ouU 
and U = for 9 > 9 ouU then Q{9) = 9 2 out 9~ 2 [n(6 2 out - tf? n )] ^ for 9 in < 9 < out , and 
Q{9) = otherwise. For this special choice of U, 

M ap = R(9 in ) - R(9 in , 9 out ) , (5.16) 

the mean mass density inside 9 la minus the mean density in the annulus 9 m < 9 < 9 out . 
Since the latter is non-negative, this yields lower limit to R(9i n ), and thus to M(6>; n ). 
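
The statements in the list above about the filter pair $(U, Q)$ can be checked by direct numerical integration. The following sketch assumes the piecewise-constant $U$ of the last item, with hypothetical radii `theta_in` and `theta_out`, and evaluates $Q(\theta) = (2/\theta^2) \int_0^\theta d\theta'\, \theta'\, U(\theta') - U(\theta)$ with a simple midpoint rule; the function names are illustrative only.

```python
import math


def u_filter(theta, theta_in, theta_out):
    """Piecewise-constant compensated weight function U."""
    if theta <= theta_in:
        return 1.0 / (math.pi * theta_in ** 2)
    if theta <= theta_out:
        return -1.0 / (math.pi * (theta_out ** 2 - theta_in ** 2))
    return 0.0


def q_numeric(theta, theta_in, theta_out, steps=20000):
    """Q(theta) from U by midpoint-rule integration of theta' * U(theta')."""
    h = theta / steps
    integral = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        integral += t * u_filter(t, theta_in, theta_out)
    integral *= h
    return 2.0 * integral / theta ** 2 - u_filter(theta, theta_in, theta_out)


def q_analytic(theta, theta_in, theta_out):
    """Closed form quoted in the text for this choice of U."""
    if theta_in <= theta <= theta_out:
        return theta_out ** 2 / (
            theta ** 2 * math.pi * (theta_out ** 2 - theta_in ** 2))
    return 0.0
```

With `theta_in = 1` and `theta_out = 2`, `q_numeric` vanishes inside `theta_in` (where $U$ is constant) and beyond `theta_out` (because $U$ is compensated), and matches `q_analytic` in the annulus up to the discretization error of the midpoint rule.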



6. Cosmic shear — lensing by the LSS 

Up to now we have considered the lensing effect of localized mass concentrations, like 
galaxies and clusters. In addition to that, light bundles propagating through the Universe 
are continuously deflected and distorted by the gravitational field of the inhomogeneous 
mass distribution, the large-scale structure (LSS) of the cosmic matter field (the reader is 
referred to John Peacock's lecture for the definition of cosmological parameters and the 
theory of structure growth in the Universe). This distortion of light bundles causes shape 
distortions of images of distant galaxies, and therefore, the statistics of the distortions 
reflect the statistical properties of the LSS. 

Cosmic shear deals with the investigation of this connection, from the measurement 
of the correlated image distortion to the inference of cosmological information from these 
distortion statistics. As we shall see, cosmic shear has become a very important tool 
in observational cosmology. From a technical point-of-view, it is quite challenging, first 
because the distortions are indeed very weak and therefore difficult to measure, and 
second, in contrast to 'ordinary' lensing, here the light deflection does not occur in a 
'lens plane' but by a 3-D matter distribution; one therefore needs a different description 
of the lensing optics. We start by looking at the description of light propagating through 
the Universe. 



6.1. Light propagation in an inhomogeneous Universe 

The laws of light propagation follow from Einstein's General Relativity; according to it, 
light propagates along the null-geodesics of the space-time metric. As shown in SEF, one 
can derive from General Relativity that the governing equation for the propagation of 
thin light bundles through an arbitrary space-time is the equation of geodesic deviation, 

$$\frac{d^2 \boldsymbol{\xi}}{d\lambda^2} = \mathcal{T}\, \boldsymbol{\xi} , \tag{6.1}$$

where $\boldsymbol{\xi}$ is the separation vector of two neighboring light rays, $\lambda$ the affine parameter 
along the central ray of the bundle, and $\mathcal{T}$ is the optical tidal matrix which describes 
the influence of space-time curvature on the propagation of light. $\mathcal{T}$ can be expressed 
directly in terms of the Riemann curvature tensor. 

For the case of a weakly inhomogeneous Universe, the tidal matrix can be explicitly 
calculated in terms of the Newtonian potential. For that, we write the slightly perturbed 
metric of the Universe in the form 

$$ds^2 = a^2(\tau) \left[ \left( 1 + \frac{2\Phi}{c^2} \right) c^2\, d\tau^2 - \left( 1 - \frac{2\Phi}{c^2} \right) \left( dw^2 + f_K^2(w)\, d\omega^2 \right) \right] , \tag{6.2}$$

where $w$ is the comoving radial distance, $a = (1 + z)^{-1}$ the scale factor, normalized to 
unity today, $\tau$ is the conformal time, related to the cosmic time $t$ through $dt = a\, d\tau$, 
$f_K(w)$ is the comoving angular diameter distance, which equals $w$ in a spatially flat 
model, and $\Phi$ denotes the Newtonian peculiar gravitational potential. In this metric, the 
equation of geodesic deviation yields, for the comoving separation vector $\boldsymbol{x}(\boldsymbol{\theta}, w)$ between 
a ray separated by an angle $\boldsymbol{\theta}$ at the observer from a fiducial ray, the evolution equation 



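$$\frac{d^2 \boldsymbol{x}}{dw^2} + K\, \boldsymbol{x} = -\frac{2}{c^2} \left[ \nabla_\perp \Phi\left( \boldsymbol{x}(\boldsymbol{\theta}, w), w \right) - \nabla_\perp \Phi^{(0)}(w) \right] , \tag{6.3}$$

where $K = (H_0/c)^2 (\Omega_{\rm m} + \Omega_\Lambda - 1)$ is the spatial curvature, $\nabla_\perp = (\partial/\partial x_1, \partial/\partial x_2)$ is the 
transverse comoving gradient operator, and $\Phi^{(0)}(w)$ is the potential along the fiducial 
ray. The formal solution of the transport equation is obtained by the method of Green's 
function, to yield 

$$\boldsymbol{x}(\boldsymbol{\theta}, w) = f_K(w)\, \boldsymbol{\theta} - \frac{2}{c^2} \int_0^w dw'\; f_K(w - w') \left[ \nabla_\perp \Phi\left( \boldsymbol{x}(\boldsymbol{\theta}, w'), w' \right) - \nabla_\perp \Phi^{(0)}(w') \right] . \tag{6.4}$$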

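A source at comoving distance $w$ with comoving separation $\boldsymbol{x}$ from the fiducial light ray 
would be seen, in the absence of lensing, at the angular separation $\boldsymbol{\beta} = \boldsymbol{x}/f_K(w)$ from 
the fiducial ray (this statement is nothing but the definition of the comoving angular 
diameter distance). Hence, in analogy with standard lens theory, we define the Jacobian 
matrix 

$$\mathcal{A}(\boldsymbol{\theta}, w) = \frac{\partial \boldsymbol{\beta}}{\partial \boldsymbol{\theta}} = \frac{1}{f_K(w)} \frac{\partial \boldsymbol{x}}{\partial \boldsymbol{\theta}} , \tag{6.5}$$

and obtain 

$$\mathcal{A}_{ij}(\boldsymbol{\theta}, w) = \delta_{ij} - \frac{2}{c^2} \int_0^w dw'\; \frac{f_K(w - w')\, f_K(w')}{f_K(w)}\, \Phi_{,ik}\left( \boldsymbol{x}(\boldsymbol{\theta}, w'), w' \right) \mathcal{A}_{kj}(\boldsymbol{\theta}, w') , \tag{6.6}$$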



which describes the locally linearized mapping introduced by LSS lensing. This equation 
still is exact in the limit of validity of the weak-field metric. Next, we expand $\mathcal{A}$ in powers 
of $\Phi$ and truncate the series after the linear term: 

$$\mathcal{A}_{ij}(\boldsymbol{\theta}, w) = \delta_{ij} - \frac{2}{c^2} \int_0^w dw'\; \frac{f_K(w - w')\, f_K(w')}{f_K(w)}\, \Phi_{,ij}\left( f_K(w')\, \boldsymbol{\theta}, w' \right) . \tag{6.7}$$

Hence, to linear order, the distortion can be obtained by integrating along the unperturbed 
ray; this is also called the Born approximation. Corrections to the Born approximation 
are necessarily of order $\Phi^2$. If we now define the deflection potential 

$$\psi(\boldsymbol{\theta}, w) := \frac{2}{c^2} \int_0^w dw'\; \frac{f_K(w - w')}{f_K(w)\, f_K(w')}\, \Phi\left( f_K(w')\, \boldsymbol{\theta}, w' \right) , \tag{6.8}$$

then $\mathcal{A}_{ij} = \delta_{ij} - \psi_{,ij}$, just as in ordinary lens theory. In this approximation, lensing 
by the 3-D matter distribution can be treated as an equivalent lens plane with deflection 
potential $\psi$, mass density $\kappa = \nabla^2 \psi/2$, and shear $\gamma = (\psi_{,11} - \psi_{,22})/2 + i \psi_{,12}$. 

6.2. Cosmic shear: the principle 
6.2.1. The effective surface mass density 

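Next, we relate $\kappa$ to the fractional density contrast $\delta$ of matter fluctuations in the 
Universe; this is done in a number of steps: 

(a) Take the 2-D Laplacian of $\psi$, and add the term $\Phi_{,33}$ in the integrand; this latter 
term vanishes in the line-of-sight integration, as can be seen by integration by parts. 

(b) We make use of the 3-D Poisson equation in comoving coordinates 

$$\nabla^2 \Phi = \frac{3 H_0^2\, \Omega_{\rm m}}{2 a}\, \delta \tag{6.9}$$

to obtain 

$$\kappa(\boldsymbol{\theta}, w) = \frac{3 H_0^2\, \Omega_{\rm m}}{2 c^2} \int_0^w dw'\; \frac{f_K(w')\, f_K(w - w')}{f_K(w)}\, \frac{\delta\left( f_K(w')\, \boldsymbol{\theta}, w' \right)}{a(w')} . \tag{6.10}$$

Note that $\kappa$ is proportional to $\Omega_{\rm m}$, since lensing is sensitive to $\Delta\rho \propto \Omega_{\rm m}\, \delta$, not just to 
the density contrast $\delta = \Delta\rho/\bar\rho$ itself. 

(c) For a redshift distribution of sources with $p_z(z)\, dz = p_w(w)\, dw$, the effective 
surface mass density becomes 

$$\kappa(\boldsymbol{\theta}) = \int dw\; p_w(w)\, \kappa(\boldsymbol{\theta}, w) = \frac{3 H_0^2\, \Omega_{\rm m}}{2 c^2} \int_0^{w_{\rm h}} dw\; g(w)\, f_K(w)\, \frac{\delta\left( f_K(w)\, \boldsymbol{\theta}, w \right)}{a(w)} , \tag{6.11}$$

with 

$$g(w) = \int_w^{w_{\rm h}} dw'\; p_w(w')\, \frac{f_K(w' - w)}{f_K(w')} , \tag{6.12}$$

which is essentially the source-redshift weighted lens efficiency factor $D_{\rm ds}/D_{\rm s}$ for a density 
fluctuation at distance $w$, and $w_{\rm h}$ is the comoving horizon distance. 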

6.2.2. Limber's equation 

The density field $\delta$ is assumed to be a realization of a random field. It is the properties 
of the random field that cosmologists can predict. In particular, the second-order 
statistical properties of the density field are described in terms of the power spectrum. We 
shall therefore look at the relation between the quantities relevant for lensing and the 
power spectrum. The basis of this relation is Limber's equation. If $\delta$ is a homogeneous 
and isotropic 3-D random field, then the projections 

$$g_i(\boldsymbol{\theta}) = \int dw\; q_i(w)\, \delta\left( f_K(w)\, \boldsymbol{\theta}, w \right) \tag{6.13}$$

also are (2-D) homogeneous and isotropic random fields, where the $q_i$ are weight 
functions. In particular, the correlation function 

$$C_{12} = \left\langle g_1(\boldsymbol{\varphi}_1)\, g_2(\boldsymbol{\varphi}_2) \right\rangle = C_{12}(|\boldsymbol{\varphi}_1 - \boldsymbol{\varphi}_2|) \tag{6.14}$$

depends only on the modulus of the separation vector. The original form of the Limber 
equation relates $C_{12}$, which is a line-of-sight projection, to the correlation function of $\delta$. 
Alternatively, one can consider the Fourier-space version of this relation: The power 
spectrum $P_{12}(\ell)$ - the Fourier transform of $C_{12}(\theta)$ - depends linearly on the power 
spectrum $P_\delta(k)$ of the density fluctuations (Kaiser 1992), 

$$P_{12}(\ell) = \int dw\; \frac{q_1(w)\, q_2(w)}{f_K^2(w)}\, P_\delta\left( \frac{\ell}{f_K(w)}, w \right) , \tag{6.15}$$

if the largest-scale structures in $\delta$ are much smaller than the effective range $\Delta w$ of the 
projection. Hence, we obtain the (very reasonable) result that the power at angular 
scale $1/\ell$ is obtained from the 3-D power at length scale $f_K(w)\, (1/\ell)$, integrated over $w$. 
Comparing (6.11) with (6.13), one sees that $\kappa(\boldsymbol{\theta})$ is such a projection of $\delta$ with the weights 
$q_1(w) = q_2(w) = (3/2) (H_0/c)^2\, \Omega_{\rm m}\, g(w)\, f_K(w)/a(w)$, so that 

$$P_\kappa(\ell) = \frac{9 H_0^4\, \Omega_{\rm m}^2}{4 c^4} \int_0^{w_{\rm h}} dw\; \frac{g^2(w)}{a^2(w)}\, P_\delta\left( \frac{\ell}{f_K(w)}, w \right) . \tag{6.16}$$

The power spectrum $P_\kappa$, if obtained through observations, can therefore be used to 
constrain the 3-D power spectrum $P_\delta$. 



6.3. Second-order cosmic shear measures 

As we shall see, all second-order statistics of the cosmic shear yield (filtered) information 
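about $P_\kappa$. The most-often used second-order statistics are: 

• The two-point correlation function(s) of the shear, $\xi_\pm(\theta)$, 

• the shear dispersion in a (circular) aperture, $\langle |\bar\gamma|^2 \rangle(\theta)$, and 

• the aperture mass dispersion, $\langle M_{\rm ap}^2 \rangle(\theta)$. 

These will be discussed next, and their relation to $P_\kappa(\ell)$ shown. As a preparation, 
consider the Fourier transform of $\kappa$, 

$$\hat\kappa(\boldsymbol{\ell}) = \int d^2\theta\; e^{i \boldsymbol{\ell} \cdot \boldsymbol{\theta}}\, \kappa(\boldsymbol{\theta}) ; \tag{6.17}$$

then, 

$$\left\langle \hat\kappa(\boldsymbol{\ell})\, \hat\kappa^*(\boldsymbol{\ell}') \right\rangle = (2\pi)^2\, \delta_{\rm D}(\boldsymbol{\ell} - \boldsymbol{\ell}')\, P_\kappa(\ell) , \tag{6.18}$$

which provides another definition of the power spectrum $P_\kappa$. The Fourier transform of 
the shear is 

$$\hat\gamma(\boldsymbol{\ell}) = \left( \frac{\ell_1^2 - \ell_2^2 + 2 i \ell_1 \ell_2}{|\boldsymbol{\ell}|^2} \right) \hat\kappa(\boldsymbol{\ell}) , \tag{6.19}$$

which implies that 

$$\left\langle \hat\gamma(\boldsymbol{\ell})\, \hat\gamma^*(\boldsymbol{\ell}') \right\rangle = (2\pi)^2\, \delta_{\rm D}(\boldsymbol{\ell} - \boldsymbol{\ell}')\, P_\kappa(\ell) . \tag{6.20}$$

Hence, the power spectrum of the shear is the same as that of the convergence. 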

6.3.1. Shear correlation functions 

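Consider a pair of points (i.e., galaxy images); their separation direction $\varphi$ (i.e. the 
polar angle of the separation vector $\boldsymbol{\theta}$) is used to define the tangential and cross-component 
of the shear at these positions for this pair, 

$$\gamma_{\rm t} = -{\rm Re}\left( \gamma\, e^{-2 i \varphi} \right) , \quad \gamma_\times = -{\rm Im}\left( \gamma\, e^{-2 i \varphi} \right) . \tag{6.21}$$

Then, the shear correlation functions are defined as 

$$\xi_\pm(\theta) = \langle \gamma_{\rm t} \gamma_{\rm t} \rangle(\theta) \pm \langle \gamma_\times \gamma_\times \rangle(\theta) , \quad \xi_\times(\theta) = \langle \gamma_{\rm t} \gamma_\times \rangle(\theta) .$$

Due to parity symmetry, $\xi_\times(\theta)$ is expected to vanish, since under such a transformation, 
$\gamma_{\rm t} \to \gamma_{\rm t}$, but $\gamma_\times \to -\gamma_\times$. Next we relate the shear correlation functions to the power 
spectrum $P_\kappa$: Using the definition of $\xi_\pm$, replacing $\gamma$ in terms of $\hat\gamma$, and making use of the 
relation between $\hat\gamma$ and $\hat\kappa$, one finds: 

$$\xi_+(\theta) = \int_0^\infty \frac{d\ell\, \ell}{2\pi}\, J_0(\ell\theta)\, P_\kappa(\ell) , \quad \xi_-(\theta) = \int_0^\infty \frac{d\ell\, \ell}{2\pi}\, J_4(\ell\theta)\, P_\kappa(\ell) . \tag{6.22}$$

$\xi_\pm$ can be measured as follows: on a data field, select all pairs of faint galaxies with 
separation within $\Delta\theta$ of $\theta$ and then take the average $\langle \epsilon_{{\rm t}i}\, \epsilon_{{\rm t}j} \rangle$ over all these pairs; since 
$\epsilon = \epsilon^{({\rm s})} + \gamma(\boldsymbol{\theta})$, the expectation value of $\langle \epsilon_{{\rm t}i}\, \epsilon_{{\rm t}j} \rangle$ is $\langle \gamma_{\rm t} \gamma_{\rm t} \rangle(\theta)$, provided source ellipticities 
are uncorrelated. Similarly, the correlation for the cross-components is obtained. 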
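
The measurement recipe just described can be sketched as a brute-force pair count. The code below is a minimal illustration, not a production estimator: it assumes flat-sky positions given as `(x, y)` tuples, complex ellipticities $\epsilon = \epsilon_1 + i\epsilon_2$, equal weights for all galaxies, and a single separation bin; the function name and its arguments are hypothetical.

```python
import cmath
import math


def shear_correlations(positions, ellipticities, theta_min, theta_max):
    """Estimate xi_+ and xi_- from all pairs with theta_min <= sep < theta_max.

    For each pair, the ellipticities are rotated into tangential and cross
    components relative to the direction phi of the separation vector.
    Returns (xi_plus, xi_minus, number_of_pairs).
    """
    sum_plus = 0.0
    sum_minus = 0.0
    n_pairs = 0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            sep = math.hypot(dx, dy)
            if not theta_min <= sep < theta_max:
                continue
            phase = cmath.exp(-2j * math.atan2(dy, dx))
            rot_i = -ellipticities[i] * phase  # equals e_t + i e_x
            rot_j = -ellipticities[j] * phase
            tt = rot_i.real * rot_j.real
            xx = rot_i.imag * rot_j.imag
            sum_plus += tt + xx
            sum_minus += tt - xx
            n_pairs += 1
    if n_pairs == 0:
        raise ValueError("no pairs in the requested separation bin")
    return sum_plus / n_pairs, sum_minus / n_pairs, n_pairs
```

For a field with the same ellipticity $\epsilon$ at every position, every pair contributes $|\epsilon|^2$ to $\xi_+$ regardless of its orientation, which gives a quick sanity check. The double loop scales quadratically with the number of galaxies, so real surveys would need a tree-based or gridded pair search instead.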

6.3.2. The shear dispersion 

Consider a circular aperture of radius $\theta$; the mean shear in this aperture is $\bar\gamma$. Averaging 
over many such apertures, one defines the shear dispersion $\langle |\bar\gamma|^2 \rangle(\theta)$. It is related to the 
power spectrum through 

$$\langle |\bar\gamma|^2 \rangle(\theta) = \frac{1}{2\pi} \int d\ell\; \ell\, P_\kappa(\ell)\, W_{\rm TH}(\ell\theta) , \quad \text{where} \quad W_{\rm TH}(\eta) = \frac{4 J_1^2(\eta)}{\eta^2} \tag{6.23}$$

is the top-hat filter function. The shear dispersion can be measured by averaging the 
square of the mean galaxy ellipticities over many independent apertures. 

6.3.3. The aperture mass 

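Consider a circular aperture of radius $\theta$; for a point inside the aperture, define the 
tangential and cross-components of the shear relative to the center of the aperture (as before); 
then define 

$$M_{\rm ap}(\theta) = \int d^2\vartheta\; Q(|\boldsymbol{\vartheta}|)\, \gamma_{\rm t}(\boldsymbol{\vartheta}) , \tag{6.24}$$

where $Q$ is a weight function with support $\vartheta \in [0, \theta]$. In the following we shall use 

$$Q(\vartheta) = \frac{6}{\pi \theta^2}\, \frac{\vartheta^2}{\theta^2} \left( 1 - \frac{\vartheta^2}{\theta^2} \right) H(\theta - \vartheta) ,$$

in which case the dispersion of $M_{\rm ap}(\theta)$ is related to the power spectrum by 

$$\langle M_{\rm ap}^2 \rangle(\theta) = \frac{1}{2\pi} \int_0^\infty d\ell\; \ell\, P_\kappa(\ell)\, W_{\rm ap}(\theta\ell) , \quad \text{with} \quad W_{\rm ap}(\eta) := \frac{576\, J_4^2(\eta)}{\eta^4} . \tag{6.25}$$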
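
To see how differently the two aperture statistics filter the power spectrum, one can tabulate $W_{\rm TH}(\eta) = 4J_1^2(\eta)/\eta^2$ and $W_{\rm ap}(\eta) = 576\,J_4^2(\eta)/\eta^4$. The sketch below is illustrative only; to stay within the standard library it evaluates $J_n$ from Bessel's integral $J_n(x) = \pi^{-1} \int_0^\pi \cos(n\tau - x \sin\tau)\, d\tau$ with a midpoint rule, which is a choice made for this example and not something taken from the text. All function names are hypothetical.

```python
import math


def bessel_j(n, x, steps=2000):
    """Bessel function J_n(x) for integer n via Bessel's integral (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h
        total += math.cos(n * tau - x * math.sin(tau))
    return total * h / math.pi


def w_top_hat(eta):
    """Top-hat filter W_TH(eta) = 4 J_1(eta)^2 / eta^2."""
    return 4.0 * bessel_j(1, eta) ** 2 / eta ** 2


def w_aperture(eta):
    """Aperture-mass filter W_ap(eta) = 576 J_4(eta)^2 / eta^4."""
    return 576.0 * bessel_j(4, eta) ** 2 / eta ** 4
```

Evaluating both on a grid of `eta` shows that `w_top_hat` tends to unity for small arguments, so the shear dispersion acts as a low-pass filter on $P_\kappa$, whereas `w_aperture` is strongly suppressed at small `eta` and largest at `eta` of a few, so the aperture mass dispersion probes a comparatively narrow band of wavenumbers.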
6.3.4. Interrelations 

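These various 2-point statistics all depend linearly on the power spectrum $P_\kappa$; therefore, 
one should not be too surprised that they are all related to each other. The surprise 
perhaps is that these interrelations are quite simple (Crittenden et al. 2002). First, the 
relations between $\xi_\pm$ and $P_\kappa$ can be inverted, making use of the orthonormality relation 
of Bessel functions: 

$$P_\kappa(\ell) = 2\pi \int_0^\infty d\theta\; \theta\, \xi_+(\theta)\, J_0(\ell\theta) = 2\pi \int_0^\infty d\theta\; \theta\, \xi_-(\theta)\, J_4(\ell\theta) . \tag{6.26}$$

Next, we take one of these and plug it into the relation between the other correlation 
function and $P_\kappa$, to find: 

$$\xi_+(\theta) = \xi_-(\theta) + \int_\theta^\infty \frac{d\vartheta}{\vartheta}\, \xi_-(\vartheta) \left( 4 - 12\, \frac{\theta^2}{\vartheta^2} \right) ; \tag{6.27}$$

$$\xi_-(\theta) = \xi_+(\theta) + \int_0^\theta \frac{d\vartheta\, \vartheta}{\theta^2}\, \xi_+(\vartheta) \left( 4 - 12\, \frac{\vartheta^2}{\theta^2} \right) . \tag{6.28}$$

Using (6.26) in the expression for the shear dispersion, one finds 

$$\langle |\bar\gamma|^2 \rangle(\theta) = \int_0^{2\theta} \frac{d\vartheta\, \vartheta}{\theta^2}\, \xi_+(\vartheta)\, S_+\!\left( \frac{\vartheta}{\theta} \right) = \int_0^\infty \frac{d\vartheta\, \vartheta}{\theta^2}\, \xi_-(\vartheta)\, S_-\!\left( \frac{\vartheta}{\theta} \right) , \tag{6.29}$$

where the $S_\pm$ are simple functions, given explicitly in Schneider et al. (2002a); 
the same procedure for the aperture mass dispersion lets us write 

$$\langle M_{\rm ap}^2 \rangle(\theta) = \int_0^{2\theta} \frac{d\vartheta\, \vartheta}{\theta^2}\, \xi_+(\vartheta)\, T_+\!\left( \frac{\vartheta}{\theta} \right) = \int_0^{2\theta} \frac{d\vartheta\, \vartheta}{\theta^2}\, \xi_-(\vartheta)\, T_-\!\left( \frac{\vartheta}{\theta} \right) , \tag{6.30}$$

again with known functions $T_\pm$ (Schneider et al. 2002a). Hence, all these 2-point statistics 
can be evaluated from the correlation functions $\xi_\pm(\theta)$, which is of particular interest, 
since they can be measured best: Real data fields contain holes and gaps (like CCD 
defects, bright stars, nearby galaxies, etc.) which make the placing of apertures difficult; 
however, the evaluation of the correlation functions is not affected by gaps, as one uses 
all pairs of galaxy images with a given angular separation. 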

6.4. Cosmic shear and cosmology 

6.4.1. Why cosmology from cosmic shear? 

Before continuing, it is worth to pause for a second and ask the question as to why one 
tries to investigate cosmological questions by using cosmic shear - since the CMB can 
measure cosmological parameters with high accuracy. Partial answers to this question 
are: 

• Cosmic shear measures the mass distribution at much lower redshifts (z ^ 1) and 
at smaller physical scales [R ~ 0.3 (0/1') Mpc] than the CMB; indeed, it is the only 
way to map out the dark matter distribution directly without any assumptions about the 
relation between dark and baryonic matter. The smaller scales probed are very important 
for constraining the shape of the power spectrum, i.e., the primordial tilt and the shape 
parameter r spoct . 

• Cosmic shear measures the non-linearly evolved mass distribution and its associated 
power spectrum P$(k); hence, in combination with the CMB it allows us to study the 
evolution of the power spectrum and in particular, provides a very powerful test of the 
gravitational instability paradigm for structure growth. 

• It provides a fully independent way to probe the cosmological model. Given the 
revolutionary claims coming from the CMB, SN Ia, and the LSS of the galaxy distribution, 
namely that more than 95% of the stuff in the Universe is in a form about whose 
physical nature we have not the slightest idea, an additional independent verification of 
these claims is certainly welcome. 

• As we shall see shortly, cosmic shear studies provide a new and highly valuable 
search method for cluster-scale matter concentrations. 

6.4.2. Expectations 

The cosmic shear signal depends on the cosmological model, parametrized by $\Omega_{\rm m}$, $\Omega_\Lambda$, 
and the shape parameter $\Gamma_{\rm spect}$ of the power spectrum, the normalization of the power 
spectrum, usually expressed in terms of $\sigma_8$, and the redshift distribution of the sources. 
By measuring $\xi_\pm$ over a significant range of angular scales one can derive constraints on 
these parameters. 

The accuracy with which $\xi_\pm$ can be measured depends on the number density of galaxies 
(that is, depth and quality of the images), the total solid angle covered by the survey, 
and its geometric arrangement (compact survey vs. widely separated pointings); it is 
determined by a combination of intrinsic ellipticity dispersion and the cosmic (or sampling) 
variance. For angular scales below about 1 degree, the non-linear evolution of the power 
spectrum becomes important for the cosmic shear signal; because of this, the expected 
signal is considerably larger than estimated from linear perturbation theory of structure 
evolution. Furthermore, the signal depends quite strongly on the mean redshift of the 
source galaxies, which suggests that deep surveys, aiming for higher-redshift galaxies, are 
best suited for cosmic shear studies. 



6.5. Observation of cosmic shear 

6.5.1. First detections 

Whereas the theory of cosmic shear was worked out in the early 1990s (Blandford 
et al. 1991; Miralda-Escudé 1991; Kaiser 1992), it took until the year 2000 before this 
effect was first discovered. The time lag reflects the need for both instrumental 
developments, i.e. the wide-field CCD mosaic cameras, and image analysis software, 
like IMCAT, with which shapes of galaxies can be corrected for PSF 
effects. Finally, in March 2000, four groups independently published their first 
discoveries of cosmic shear (Bacon et al. 2000; Kaiser et al. 2000; van Waerbeke et al. 2000; 
Wittman et al. 2000). In these surveys, of the order of $10^5$ galaxy images have been 
analyzed, covering about 1 deg$^2$. The fact that the results from these four independent 
teams agreed within the respective error bars immediately gave credence to this new 
window of observational cosmology. Furthermore, 4 different telescopes, 5 different 
cameras, independent data reduction tools and at least two different image analysis methods 
have been used in these studies. Maoli et al. (2001) reported a significant cosmic shear 
measurement from 50 widely separated FORS1@VLT images, which also agreed with the 
earlier measurements. 

Figure 7. Constraints in the $\Omega_{\rm m}$-$\sigma_8$ parameter plane, from the VIRMOS-DESCART survey 
(van Waerbeke et al. 2001). For this figure, in which the 1-, 2-, and 3-$\sigma$ confidence regions are 
indicated, a zero cosmological constant has been assumed, and the redshift distribution of the source 
galaxies was assumed to be known, as well as the shape parameter $\Gamma_{\rm spect} = 0.21$. 

6.5.2. Deriving constraints 

From the measured correlation functions $\xi_\pm(\theta)$, obtaining constraints on cosmological 
parameters can proceed through minimizing 

$$\chi^2(p) = \sum_{ij} \left(\xi_i(p) - \xi_i^{\rm obs}\right) {\rm Cov}^{-1}_{ij} \left(\xi_j(p) - \xi_j^{\rm obs}\right) , \tag{6.31}$$

with $\xi_i = \xi(\theta_i)$ being the binned correlation function(s) (i.e., either $\xi_+$ or $\xi_-$, or using both), $p$ 
is a set of cosmological parameters, and ${\rm Cov}^{-1}$ the inverse covariance matrix. The latter 
can be determined either from the $\xi_\pm$ themselves, from simulations, or estimated from the 
data (see Schneider et al. 2002b). Nevertheless, the calculation of the covariance is fairly 
cumbersome, and most authors have used approximate methods to derive it, such as the 
field-to-field variations of the measured correlation. As it turns out, $\xi_+(\theta)$ is strongly 
correlated across angular bins, $\xi_-(\theta)$ much less so; this is due to the fact that the 
filter function that describes $\xi$ in terms of the power spectrum $P_\kappa$ is much broader for 
$\xi_+$ (namely $J_0$) than $J_4$, which applies for $\xi_-$. Of course, a corresponding figure-of-merit 
function can be defined for the other second-order shear statistics, with their respective 
covariance matrices, but as argued before, the correlation functions should be regarded 
as the basic observable statistics. 
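
The minimization described in this subsection can be sketched as follows, under strong simplifying assumptions: the figure-of-merit is taken to be a quadratic form in the difference between binned data and model, weighted by a given inverse covariance matrix, and the model is a hypothetical one-parameter family $\xi_i(p) = p^2 t_i$ with a fixed template $t_i$. All numbers are synthetic (the 'data' are generated from the template itself) and every name is illustrative; none of this reproduces an actual survey analysis.

```python
def chi_square(xi_obs, xi_model, cov_inv):
    """Quadratic form sum_ij d_i Cov^-1_ij d_j with d = xi_obs - xi_model."""
    d = [o - m for o, m in zip(xi_obs, xi_model)]
    n = len(d)
    return sum(d[i] * cov_inv[i][j] * d[j] for i in range(n) for j in range(n))


def toy_model(p, template):
    """Hypothetical one-parameter model: amplitude p^2 times a fixed template."""
    return [p * p * t for t in template]


# Synthetic setup: three angular bins, data generated with p = 0.9.
TEMPLATE = [4.0e-4, 2.0e-4, 1.0e-4]
XI_OBS = toy_model(0.9, TEMPLATE)
# Illustrative inverse covariance with non-zero off-diagonal terms (correlated bins).
COV_INV = [[1.0e10, -2.0e9, 0.0],
           [-2.0e9, 1.0e10, -2.0e9],
           [0.0, -2.0e9, 1.0e10]]


def best_fit(grid):
    """Brute-force search for the grid value of p that minimizes chi^2."""
    return min(grid, key=lambda p: chi_square(XI_OBS, toy_model(p, TEMPLATE), COV_INV))


if __name__ == "__main__":
    grid = [0.5 + 0.01 * k for k in range(101)]
    print("best-fitting p on the grid:", best_fit(grid))
```

A grid search is used only because it is transparent; with several parameters one would normally replace it by a proper minimizer or a likelihood sampler.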

6.5.3. Recent results 

Since the first detections of cosmic shear, described above, there have been a large 
number of measurements over the past three years. Instead of mentioning them all here, 
we refer the reader to the recent reviews by van Waerbeke & Mellier (2003) and Refregier 
(2003). State-of-the-art are deep surveys, similar to those with which the first cosmic 
shear results have been derived, but with significantly larger solid angle (van Waerbeke 
et al. 2001, 2002), or shallower surveys of much larger area (e.g., Hoekstra et al. 2002c; 
Jarvis et al. 2003). The results of these surveys, which contain of the order of $10^6$ 
galaxies, i.e., an order of magnitude more than the discovery surveys mentioned above, 
can be summarized roughly as follows: 

Cosmic shear by itself presently does not provide strong constraints on multi-dimensional 
cosmological parameter space. Hence, if one does not fix most of the cosmological 
parameters from external sources, the allowed region in multi-dimensional parameter space 
is still quite large. On the other hand, if one considers a restricted set of parameters, 
cosmic shear results are very powerful. An example of that is given in Fig. 7, where 
all cosmological parameters have been kept fixed, except $\Omega_{\rm m}$ and the normalization $\sigma_8$. 
In this case, one finds indeed a well-defined maximum of the corresponding likelihood. 
Interestingly, the direction of the 'likelihood valley' nearly coincides with the constraint 
obtained from the cluster abundance, i.e., it provides a constraint on $\sigma_8 \Omega_{\rm m}^{0.6}$. If the other 
cosmological parameters are not assumed to be known precisely, but are marginalized 
over a plausible uncertainty range, the likelihood contours widen substantially. 

When combined with results from other methods, cosmic shear yields very useful 
information. For example, as pointed out by Hu & Tegmark (1999), when combined with 
data from CMB anisotropy, cosmic shear can break degeneracies of model parameters 
which are present when using the CMB data alone. In the $\Omega_{\rm m}$-$\sigma_8$ parameter plane, 
cosmic shear constraints are nearly perpendicular to those from the CMB (van Waerbeke 
et al. 2002). 

Therefore, at present the best use of cosmic shear results is in constraining the 
normalization $\sigma_8$ of the density perturbations, for a set of other cosmological parameters fixed 
by other methods, such as the CMB, galaxy redshift surveys, etc. The various cosmic 
shear surveys have given a range of $\sigma_8$ determinations which is about as narrow as current 
estimates from the abundance of massive clusters (see van Waerbeke & Mellier 2003 for 
a summary of these results). Given the youth of this field, this indeed is a remarkable 
achievement already. Furthermore, since the determinations of $\sigma_8$ from cluster abundance 
and cosmic shear agree, one learns something important: the cluster abundance depends 
on the assumed Gaussianity of the primordial density field, whereas the constraint from 
cosmic shear does not. Hence, the agreement between the two methods supports the 
idea of an initial Gaussian field. Without doubt, the next generation of cosmic shear 
surveys will provide highly accurate determinations of this normalization, as well as other 
(combinations of) cosmological parameters. 



6.6. E-modes, B-modes 

In the derivation of the lensing properties of the LSS, we ended up with an equivalent 
surface mass density. In particular, this implied that $\mathcal{A}$ is a symmetric matrix, and that the 
shear can be obtained in terms of $\kappa$ or $\psi$. Now, the shear is a 2-component quantity, 
whereas both $\kappa$ and $\psi$ are scalar fields. This implies that the two shear components are 
not independent of each other! 

Recall that (5.7) yields a relation between the gradient of $\kappa$ and the first derivatives of 
the shear components; in particular, (5.7) implies that $\nabla \times u_\gamma = 0$, yielding a local 
constraint relation between the shear components. The validity of this constraint equation 
guarantees that the imaginary part of (5.4) vanishes. This constraint is also present at 
the level of 2-point statistics, since one expects from (6.26) that 

$$\int_0^\infty d\theta\,\theta\,\xi_+(\theta)\,J_0(\theta\ell) = \int_0^\infty d\theta\,\theta\,\xi_-(\theta)\,J_4(\theta\ell) . \tag{6.32}$$

Hence, the two correlation functions $\xi_\pm$ are not independent. The observed shear field is 
not guaranteed to satisfy these relations, due to noise, remaining systematics, or other 
effects. Therefore, searching for deviations from this relation allows a check for these 
effects. However, there might also be a 'shear' component present that is not due to 
lensing (by a single equivalent thin matter sheet $\kappa$). Shear components which satisfy the 
foregoing relations are called E-modes; those which don't are B-modes - these names are 
borrowed from the polarization of the CMB, which has the same mathematical properties 
as the shear field. 


Figure 8. Dispersion of the aperture mass (upper row) and its analogue $\langle M_\perp^2\rangle$ for the 
cross-component, as obtained from the Red Cluster Sequence survey (Hoekstra et al. 2002). The 
left panels show the results for a broad range of galaxy brightnesses, whereas the middle and 
right panels display the results for the bright and the fainter parts, respectively, of the sample. 
Clearly, the presence of the B-mode is seen; its strength decreases for the fainter part of the 
sample; this behavior is expected if the B-mode is due to intrinsic alignments of galaxies. Its 
relative importance decreases with increasing width of the redshift distribution of galaxies. 



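The best way to separate these modes locally is provided by the aperture measures: 
$\langle M_{\rm ap}^2(\theta)\rangle$ is sensitive only to E-modes. If one defines in analogy 

$$M_\perp(\theta) = \int d^2\vartheta\; Q(|\vartheta|)\,\gamma_\times(\vartheta) , \tag{6.33}$$

then $\langle M_\perp^2(\theta)\rangle$ is sensitive only to B-modes. 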
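
The E/B separation by aperture measures can be illustrated with a minimal sketch. It assumes a discrete estimator in which the integral over the aperture is replaced by a sum over galaxy ellipticities, a polynomial filter $Q$ that vanishes outside the aperture radius (one common choice, adopted here as an assumption), and the convention that the tangential and cross components are minus the real and imaginary parts of $\epsilon\,e^{-2i\varphi}$. The toy field is purely tangential; rotating every ellipticity by 45 degrees turns it into a pure cross-component field.

```python
import cmath
import math


def q_filter(r, theta):
    """Assumed polynomial filter Q; non-zero only inside the aperture radius theta."""
    x = r / theta
    if x >= 1.0:
        return 0.0
    return 6.0 / (math.pi * theta**2) * x**2 * (1.0 - x**2)


def aperture_measures(positions, ellipticities, theta):
    """Estimate (M_ap, M_perp) for an aperture centred on the origin.

    positions: (x, y) of galaxies inside the aperture; ellipticities: complex numbers.
    """
    m_ap = 0.0
    m_perp = 0.0
    for (x, y), eps in zip(positions, ellipticities):
        phi = math.atan2(y, x)
        rotated = -eps * cmath.exp(-2j * phi)  # real part: tangential, imaginary part: cross
        weight = q_filter(math.hypot(x, y), theta)
        m_ap += weight * rotated.real
        m_perp += weight * rotated.imag
    area_per_galaxy = math.pi * theta**2 / len(positions)  # inverse number density
    return m_ap * area_per_galaxy, m_perp * area_per_galaxy


# Toy catalogue: galaxies on a regular grid inside an aperture of radius THETA.
THETA = 1.0
POSITIONS = [(0.1 * i, 0.1 * j) for i in range(-9, 10) for j in range(-9, 10)
             if math.hypot(0.1 * i, 0.1 * j) < THETA]
G = 0.05  # illustrative shear amplitude
E_FIELD = [-G * cmath.exp(2j * math.atan2(y, x)) for x, y in POSITIONS]  # tangential pattern
B_FIELD = [1j * eps for eps in E_FIELD]  # every ellipticity rotated by 45 degrees

if __name__ == "__main__":
    print("E-type field (M_ap, M_perp):", aperture_measures(POSITIONS, E_FIELD, THETA))
    print("B-type field (M_ap, M_perp):", aperture_measures(POSITIONS, B_FIELD, THETA))
```

In real data the ellipticities carry noise and the galaxy distribution has gaps, which is why the dispersions of these quantities are in practice computed from the correlation functions rather than by laying down apertures.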

Significant B-modes have been discovered in cosmic shear surveys (e.g., van Waerbeke 
et al. 2002; Hoekstra et al. 2002; see Fig. 8); the question now is what they are due to. 
As mentioned before, the noise, which contributes to both E- and B-modes in similar 
strengths, could be underestimated, there could be remaining systematic effects, or there 
could indeed be a real B-mode on the sky. There are two possibilities known to 
generate a B-mode through lensing: The first-order in $\Phi$ (or 'Born') approximation may 
not be strictly valid, but as shown by ray-tracing simulations through cosmic matter fields 
(e.g., Jain et al. 2000) the resulting B-modes are expected to be very small. Clustering 
of sources also yields a finite B-mode (Schneider et al. 2002a), but again, this effect is 
much smaller than the observed amplitude of the B-modes. 

Currently the best guess for the origin of a finite B-mode is an intrinsic correlation 
of galaxy ellipticities. Such intrinsic alignments of galaxy ellipticities can be caused by 
the tidal gravitational field of the large-scale structure at galaxy formation. Predictions 
of the alignment of the projected ellipticity of the galaxy mass can be made analytically 
(e.g. tidal torque theory) or from numerical simulations; however, the predictions from 
various groups differ by large factors (e.g., Croft & Metzler 2000; Crittenden et al. 2001; 
Heavens et al. 2000; Jing 2002) which means that the process is not well understood at 
present. In addition, there remains the question of whether the orientation of the galaxy 
light (which is the issue of relevance here) is the same as that of the mass. 

If intrinsic alignments play a role, then 

$$\xi_+ = \langle \epsilon_i \epsilon_j^* \rangle = \langle \epsilon_i^{(s)} \epsilon_j^{(s)*} \rangle + \xi_+^{\rm lens} , \tag{6.34}$$

and measured correlations $\xi_\pm$ contain both components. Of course, there is no reason 
why intrinsic correlations should have only a B-mode. If a B-mode contribution is 
generated through this process, then the measured E-mode is also contaminated by 
intrinsic alignments. In fact, the various models do not agree on the relative strength of 
E- and B-modes in the intrinsic alignments of galaxies, but it seems that the E-modes 
have generally higher amplitude than the B-modes. Given that intrinsic alignments yield 
ellipticity correlations only for spatially close sources (i.e., close in 3-D, not merely in 
projection), it is clear that the deeper a cosmic shear survey is, and thus the broader the 
redshift distribution, the smaller is the relative amplitude of an intrinsic signal. Most 
of the theoretical predictions on the strength of intrinsic alignments say that the deep 
cosmic shear surveys (say, with mean source redshifts of $\langle z_s \rangle \sim 1$) are affected at a $\sim 10\%$ 
level, but that shallow cosmic shear surveys are more strongly affected; for them, the 
intrinsic alignment can be of the same order as, or larger than the lensing signal. 

However, the intrinsic signal can be separated from the lensing signal if redshift 
information of the sources is available, owing to the fact that $\langle \epsilon_i^{(s)} \epsilon_j^{(s)*} \rangle$ will be non-zero only 
if the two galaxies are at the same redshift. Hence, if z-information is available (e.g., 
photometric redshifts), then galaxy pairs which are likely to have similar redshifts are 
to be avoided in estimating the cosmic shear signal (King & Schneider 2002; Heymans 
& Heavens 2003). This will change the expectation value of the shear correlation 
function, but in a controllable way, as the redshifts are assumed to be known. Indeed, using 
(photometric) redshifts, one can simultaneously determine the intrinsic and the lensing 
signal, essentially providing a cosmic shear tomography (King & Schneider 2003). 
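
The idea of avoiding pairs at similar redshift can be sketched as a simple unweighted pair estimator for $\xi_+$ in one angular bin. This is only an illustration under stated assumptions: the catalogue layout, the function name and the hard redshift cut `dz_min` are hypothetical, and the published methods weight pairs according to their photometric redshift probability rather than applying a sharp cut.

```python
import math


def xi_plus_excluding_close_pairs(galaxies, theta_min, theta_max, dz_min):
    """Estimate xi_+ = <eps_i eps_j^*> in one angular bin, skipping close-in-z pairs.

    galaxies: list of (x, y, eps, z_phot) with eps a complex ellipticity.
    Pairs with |z_i - z_j| < dz_min are discarded as potentially physically close.
    Returns (xi_plus estimate, number of pairs used).
    """
    total = 0.0
    n_pairs = 0
    for i in range(len(galaxies)):
        x1, y1, e1, z1 = galaxies[i]
        for j in range(i + 1, len(galaxies)):
            x2, y2, e2, z2 = galaxies[j]
            separation = math.hypot(x1 - x2, y1 - y2)
            if not (theta_min <= separation < theta_max):
                continue
            if abs(z1 - z2) < dz_min:
                continue
            total += (e1 * e2.conjugate()).real
            n_pairs += 1
    estimate = total / n_pairs if n_pairs else float("nan")
    return estimate, n_pairs


# Toy catalogue (synthetic): the first two galaxies share a redshift and are aligned.
TOY_GALAXIES = [
    (0.0, 0.0, 0.3 + 0.0j, 0.50),
    (1.0, 0.0, 0.3 + 0.0j, 0.52),
    (0.0, 1.0, 0.0 + 0.1j, 1.00),
]

if __name__ == "__main__":
    print("all pairs:      ", xi_plus_excluding_close_pairs(TOY_GALAXIES, 0.5, 1.5, 0.0))
    print("close pairs cut:", xi_plus_excluding_close_pairs(TOY_GALAXIES, 0.5, 1.5, 0.1))
```

Removing pairs changes the effective redshift weighting of the estimator, so the model prediction has to be computed with the same pair selection.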

6.7. Higher-order statistics 

On the level of second-order statistics, 'only' the power spectrum is probed. If the density 
field were Gaussian, then the power spectrum would fully characterize it; however, in the 
course of non-linear structure evolution, non-Gaussian features of the density field are 
generated, which show up correspondingly in the cosmic shear field and which can be 
probed by higher-order shear statistics. The usefulness of these higher-order measures 
for cosmic shear has been pointed out in Bernardeau et al. (1997) and van Waerbeke 
et al. (1999); in particular, the near-degeneracy between $\sigma_8$ and $\Omega_{\rm m}$ can be broken. 
However, there are serious problems with higher-order shear statistics, which shall be 
illustrated in terms of the third-order statistics. The three-point correlation function 
has three independent variables (e.g. the sides of a triangle) and 8 components; as was 
shown in Schneider & Lombardi (2003), none of these eight components vanishes owing 
to parity invariance. This then implies that the covariance matrix has 6 arguments and 
64 components! Of course, this is too difficult to handle efficiently, and therefore one 
must ask which combinations of the components of the 3-pt correlation function are 
most useful for studying the dark matter distribution. Unfortunately, this is still 
essentially unknown. An additional problem is that the predictions from theory are less well 
established than for the second-order statistics. 

Nevertheless, progress has been made. From ray-tracing simulations through a cosmic 
matter distribution, the 3-pt correlation function of the shear can be determined 
(Takada & Jain 2003; see also Zaldarriaga & Scoccimarro 2003); in addition, Schneider 
& Lombardi (2003) have defined the 'natural components' of the 3-pt correlator which 
are most easily related to the bispectrum of the underlying matter distribution. 

Alternatively, aperture measures can be defined to measure the third-order statistics. 
Schneider et al. (1998) calculated $\langle M_{\rm ap}^3 \rangle(\theta)$ in the frame of the quasi-linear structure 
evolution model and showed it to be a strong function of $\Omega_{\rm m}$. Indeed, $\langle M_{\rm ap}^3 \rangle$ is sensitive 
only to the E-modes of the shear field. One might be tempted to use $\langle M_\perp^3 \rangle(\theta)$ as a 
measure for third-order B-mode statistics, but in fact this quantity vanishes owing to 
parity invariance. However, $\langle M_\perp^2 M_{\rm ap} \rangle$ is a measure for the B-modes at the third-order 
statistical level. Bernardeau et al. (2002) measured for the first time a significant 
third-order shear signal from the VIRMOS-DESCART survey, employing a suitably filtered integral 
over the measured 3-pt correlation function. With the upcoming large cosmic shear 
surveys, the 3-pt function will be measured with high accuracy. 

6.8. Weak lensing search for cluster-mass dark halos 

As we have seen, the mass distribution of clusters of galaxies can be mapped by weak 
lensing techniques. In fact, the coherent alignment of background galaxy images clearly 
shows the presence of a massive matter concentration at or near the location of 
the optically or X-ray selected cluster towards which the weak lensing observations were 
targeted. As pointed out in Schneider (1996), one can use weak lensing to search for 
clusters: seeing a strong alignment of galaxy images centered onto a point on a wide-field 
image, one would conclude the presence of a mass concentration there. A very useful 
way to quantify this is the aperture mass statistics, already introduced. By selecting an 
appropriate filter function, one can systematically search for statistically significant peaks 
of $M_{\rm ap}$ on wide-field images. In fact, the data needed for this investigation are the same 
as those used in cosmic shear surveys. 

Since clusters of galaxies are very important cosmological probes, e.g., to determine 
the normalization of the power spectrum of the matter inhomogeneities in the Universe, 
a selection of clusters based on their mass properties only would be extremely useful. 
Usually, clusters are selected by their optical or X-ray properties; to transform 
luminosity or X-ray temperature into a mass estimate, and thus to transform a flux-limited 
cluster sample into a mass-limited sample, which can then be compared to cosmological 
predictions, one needs to employ a number of approximations and scaling relations. In 
contrast to this, the shear selection can directly be compared to cosmological predictions, 
e.g., by calculating the abundance of peaks of $M_{\rm ap}$ directly from N-body simulations of 
structure formation (e.g., Reblinsky et al. 1999), without relying on the luminous 
properties of baryonic matter, or even on identifying cluster-mass halos in the simulated 
density fields. 

The abundance of peaks above a given threshold $M_{\rm ap}$, at a given angular scale, can 
also be used as a cosmic shear measure. In fact, in future large cosmic shear surveys 
this will most likely become one of the most useful statistics for studying non-Gaussian 
aspects of the shear field. Within the frame of Press-Schechter theory, Kruse & Schneider 
(1999) calculated the $M_{\rm ap}$ peak statistics, which was then compared with direct numerical 
simulation by Reblinsky et al. (1999). White et al. (2002) pointed out that the $M_{\rm ap}$ 
statistics can be substantially affected by the large-scale structure along the line-of-sight 
to these mass concentrations; this implies that the relation between $M_{\rm ap}$ and the mass 
of the clusters is not simple. But again, this method does not require a mass function 
to be determined, as the $M_{\rm ap}$-statistics can be obtained directly from LSS simulations. 
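
Counting peaks above a threshold is straightforward once $M_{\rm ap}$ has been evaluated on a grid of aperture centres, whether from data or from a simulated shear field. The sketch below assumes such a gridded map is already available as a list of rows and defines a peak as a pixel that exceeds the threshold and is strictly larger than all of its (up to eight) neighbours; both the definition and the function name are illustrative choices, and the small map is synthetic.

```python
def find_peaks(m_ap_map, threshold):
    """Return (row, column) indices of local maxima of the map above `threshold`.

    m_ap_map: list of equally long rows of M_ap (or signal-to-noise) values.
    A peak must be strictly larger than every neighbouring pixel.
    """
    n_rows = len(m_ap_map)
    n_cols = len(m_ap_map[0])
    peaks = []
    for i in range(n_rows):
        for j in range(n_cols):
            value = m_ap_map[i][j]
            if value <= threshold:
                continue
            neighbours = [
                m_ap_map[a][b]
                for a in range(max(0, i - 1), min(n_rows, i + 2))
                for b in range(max(0, j - 1), min(n_cols, j + 2))
                if (a, b) != (i, j)
            ]
            if all(value > other for other in neighbours):
                peaks.append((i, j))
    return peaks


# Synthetic 5x5 map with two isolated maxima.
TOY_MAP = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 3.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 5.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]

if __name__ == "__main__":
    for threshold in (2.0, 4.0):
        print(threshold, find_peaks(TOY_MAP, threshold))
```

Applying the same function to maps from simulations and from observations gives peak abundances that can be compared directly, without identifying halos in either.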

Several clusters, or cluster candidates, have been found that way. Erben et al. (2000) 
detected a highly significant shear signal corresponding to a putative mass peak, about 
7' away from the cluster Abell 1942, seen on two images taken with different filters and 
different cameras. No obvious concentration of galaxies is seen in this direction, neither 
in the optical nor near-IR images (Gray et al. 2001), making it a candidate for a 'dark 
clump'; however, before drawing this conclusion, further investigations are needed, such 
as imaging with the HST to confirm the shear measurements. Umetsu & Futamase (2000) 
found a significant mass concentration on an HST image, again without an obvious optical 
counterpart. In contrast to this, Mellier et al. (2000) found a mass peak in one of their 
50 VLT fields taken for a cosmic shear survey with FORS, which is clearly associated 
with a concentration of galaxies. Wittman et al. (2001, 2002) detected two clusters on 
their wide-field images, and confirmed them spectroscopically. In fact, making use of 
photometric redshift estimates of the background galaxies, and employing the redshift 
dependence of the lens strength, they were able to estimate rather precisely the cluster 
redshifts, which were later confirmed with spectroscopy. Two of the three putative mass 
concentrations found by Dahle et al. (2002) are also very likely to be associated with 
luminous clusters. Hence, the shear selection of clusters has already been proven as a 
very useful concept. 

6.9. Lensing in three dimensions 

Using (photometric) redshift estimates in cosmic shear research is not only useful to 
remove the potential contribution from intrinsic alignments of galaxy ellipticities. If one 
defines galaxy populations with different redshift distributions, one can probe different 
projections of the cosmic density field; see (5.8). This then increases the information 
one can extract from a cosmic shear survey, and thus the ability to discriminate between 
different cosmological models (Hu 1999). 

More recently, it was pointed out (Taylor 2001; Hu & Keeton 2002) that 
redshift information can in principle be used to reconstruct the three-dimensional density 
field $\delta$ from the shear measurements. This is based on the possibility to invert (5.8), i.e., 
to express $\delta(w)$ in terms of $\kappa(w)$. The study of the 3-dimensional mass distribution is 
particularly interesting for constraining the properties of the dark energy in the Universe 
(e.g., Heavens 2003). 

7. Conclusions 

Due to its insensitivity to the nature of matter causing the gravitational potential, 
gravitational lensing has turned out to be an ideal tool to probe the structure of the 
(dark) matter distribution in the Universe, from small to large scales. Progress in this 
field has been very rapid in recent years, and due to the fast pace at which new 
instruments become available, it is certain to continue its role as an important tool 
for observational cosmology. For example, the first images from the new camera ACS are 
breathtaking and will certainly lead to much improved mass models of clusters of galaxies. 
The new square degree optical cameras will provide cosmic shear surveys covering an 
appreciable fraction of the sky - as a consequence, statistical uncertainties and cosmic 
variance will then no longer be the main contribution to the error budget, but systematic 
effects in estimating shear from CCD images may well take over. The Next Generation 
Space Telescope will provide a superb tool for studying galaxy-scale lens systems, as well 
as clusters. 